Abstract



We derive a new representation for U- and V-statistics. Using this representation, the asymptotic distribution of U- and V-statistics can be derived by a direct application of the Continuous Mapping Theorem. This novel approach not only encompasses most of the results on the asymptotic distribution known in the literature, but also allows for the first time a unifying treatment of non-degenerate and degenerate U- and V-statistics. Moreover, it yields a new and powerful tool to derive the asymptotic distribution of very general U- and V-statistics based on long-memory sequences. This will be exemplified by several astonishing examples. In particular, we shall present examples where weak convergence of U- or V-statistics occurs at the rate $a_n^3$ and $a_n^4$, respectively, when $a_n$ is the rate of weak convergence of the empirical process. We also introduce the notion of asymptotic (non-) degeneracy, which often appears in the presence of long-memory sequences.

Keywords: non-degenerate and degenerate U- and V-statistics, Hoeffding decomposition, von Mises decomposition, empirical process, weakly dependent data, strongly dependent data, central and non-central weak limit theorems, Appell polynomials, strong limit theorems



1 Introduction 



The study of the asymptotic distribution of U- and V- (von Mises-) statistics goes back to Halmos (1946), Hoeffding (1948) and von Mises (1947). Different approaches have been proposed to obtain the asymptotic distribution of these statistics. The most-used one is certainly based on the Hoeffding decomposition of a U-statistic; see, for instance, Dehling (2004), Denker (1985), Koroljuk and Borovskich (1994), Lee (1990), Serfling (1980). Recently, Beutner and Zähle (2012) showed that the asymptotic distribution of U- and V-statistics can be obtained by using the concept of quasi-Hadamard differentiability introduced in Beutner and Zähle (2010). This concept led to new results for U- and V-statistics based on weakly dependent data and was shown in Beutner et al. (2012) to be even suitable for a certain class of U- and V-statistics based on long-memory sequences. However, a general result that allows one to deduce non-central limit theorems for general U- and V-statistics based on long-memory sequences is still missing. This is due to the fact that for long-memory sequences several parts of the Hoeffding decomposition may contribute to the limiting distribution; see Dehling and Taqqu (1991) and our discussion before Corollary 4.3 in Section 4.

In this article, we derive a new representation of U- and V-statistics. Based on this 
representation the asymptotic distribution of U- and V-statistics, subject to certain 
regularity conditions, can be inferred by a direct application of the Continuous Mapping 
Theorem. It turns out that the continuous mapping approach not only covers the
majority of the results known in the literature and allows a unifying treatment of
non-degenerate and degenerate U- and V-statistics (see also Section 2 for the definitions of
non-degeneracy and degeneracy), but also supplements the existing theorems for U- and
V-statistics based on long-memory sequences. We shall further see that the continuous 
mapping approach allows us to establish strong laws for U- and V-statistics. Using the 
continuous mapping approach it will also be seen that, once the new representation is 
established, the asymptotic distributions of several degenerate U- and V-statistics that 
are usually derived on a case by case basis, are direct consequences of more general 
results. Finally, we will demonstrate that, under certain conditions on the kernel, the 
continuous mapping approach is also suitable to derive the asymptotic distribution of 
two-sample U- and V-statistics. 

To explain our approach, we first of all recall that U- and V-statistics (of degree 2) are nonparametric estimators for the characteristic

$$V_g(F) := \iint g(x_1,x_2)\,dF(x_1)\,dF(x_2) \tag{1}$$

of a distribution function (df) $F$ on the real line, where $g:\mathbb{R}^2\to\mathbb{R}$ is some measurable function and it is assumed that the double integral in (1) exists. Given a sequence $(X_i)_{i\in\mathbb{N}}$ of random variables on some probability space $(\Omega,\mathcal{F},\mathbb{P})$ being identically distributed according to $F$, the V-statistic based on $F_n$ is given by

$$V_g(F_n) = \iint g(x_1,x_2)\,dF_n(x_1)\,dF_n(x_2), \tag{2}$$

where $F_n$ denotes some estimate of $F$ based on $X_1,\ldots,X_n$, and it is assumed that the integral in (2) exists for all $n\in\mathbb{N}$. The corresponding U-statistic is given by

$$U_{g,n} := \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1,\,j\neq i}^{n} g(X_i,X_j). \tag{3}$$
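To make definitions (2) and (3) concrete, the following Python sketch evaluates both statistics when $F_n$ is the empirical df. It is an illustration only: the function names `v_statistic` and `u_statistic`, the choice of the variance kernel $g(x_1,x_2) = (x_1-x_2)^2/2$ as test case, and the simulated normal sample are assumptions made here and do not come from the paper.

```python
import numpy as np

def v_statistic(x, g):
    """V-statistic (2) with F_n the empirical df: mean of g over all n^2 pairs."""
    x = np.asarray(x, dtype=float)
    vals = g(x[:, None], x[None, :])   # n x n matrix with entries g(X_i, X_j)
    return vals.mean()

def u_statistic(x, g):
    """U-statistic (3): mean of g over the n(n-1) pairs with i != j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    vals = g(x[:, None], x[None, :])
    return (vals.sum() - np.trace(vals)) / (n * (n - 1))

def variance_kernel(x1, x2):
    # illustrative kernel (assumption): g(x1, x2) = (x1 - x2)^2 / 2
    return 0.5 * (x1 - x2) ** 2

rng = np.random.default_rng(0)
sample = rng.normal(size=50)           # hypothetical data
v_val = v_statistic(sample, variance_kernel)
u_val = u_statistic(sample, variance_kernel)
```

For this kernel the V-statistic should agree with the sample variance with divisor $n$ and the U-statistic with the sample variance with divisor $n-1$, which gives a quick sanity check of the two definitions.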

Assuming $\iint |g(x_1,x_2)|\,dF_n(x_1)\,dF(x_2) < \infty$, we obtain from (2) the decomposition

$$\begin{aligned}
V_g(F_n)-V_g(F) ={}& \int g_{1,F}(x_1)\,dF_n(x_1) - \int g_{1,F}(x_1)\,dF(x_1) \\
&+ \int g_{2,F}(x_2)\,dF_n(x_2) - \int g_{2,F}(x_2)\,dF(x_2) \\
&+ \iint g(x_1,x_2)\,d(F_n-F)(x_1)\,d(F_n-F)(x_2)
\end{aligned} \tag{4}$$

with $g_{1,F}(\cdot) := \int g(\cdot,x_2)\,dF(x_2)$ and $g_{2,F}(\cdot) := \int g(x_1,\cdot)\,dF(x_1)$. This decomposition is sometimes called von Mises decomposition of $V_g(F_n)-V_g(F)$; see Koroljuk and Borovskich (1994, p. 40). If $F_n$ is the empirical df $\hat F_n := \frac{1}{n}\sum_{i=1}^{n} 1_{[X_i,\infty)}$ of $X_1,\ldots,X_n$, then the first two lines and the third line on the right-hand side of (4) are the linear part and the degenerate part of the von Mises decomposition, respectively. In this case, the linear part of the von Mises decomposition coincides with the linear part of the Hoeffding decomposition of $U_{g,n}-V_g(F)$, and the degenerate part differs from the degenerate part of the Hoeffding decomposition of $U_{g,n}-V_g(F)$ only by

$$\frac{1}{n-1}\,V_g(\hat F_n) - \frac{1}{n(n-1)}\sum_{i=1}^{n} g(X_i,X_i). \tag{5}$$

While the linear part can usually be treated by a central limit theorem (applied to the random variables $Y_i := g_{1,F}(X_i)+g_{2,F}(X_i)$, $i\in\mathbb{N}$), it is exactly the degenerate part that causes the main difficulties in deriving the asymptotic distribution of U- and V-statistics. Now let us suppose that we may apply a one-dimensional integration-by-parts formula to the first two lines on the right-hand side of (4) and a two-dimensional integration-by-parts formula to the third line on the right-hand side of (4). Notice that this assumption in particular implies that $g_{1,F}$ and $g_{2,F}$ generate (possibly signed) measures on $\mathbb{R}$ and that $g$ generates a (possibly signed) measure on $\mathbb{R}^2$. We then have the following representation (assuming that expressions like $\lim_{x\to\infty}(F_n-F)(x)\,g_{1,F}(x)$ are equal to zero $\mathbb{P}$-a.s.)



$$\begin{aligned}
V_g(F_n)-V_g(F) ={}& -\int [F_n(x_1-)-F(x_1-)]\,dg_{1,F}(x_1) \\
&- \int [F_n(x_2-)-F(x_2-)]\,dg_{2,F}(x_2) \\
&+ \iint (F_n-F)(x_1-)\,(F_n-F)(x_2-)\,dg(x_1,x_2),
\end{aligned} \tag{6}$$



where we refer to the sum of the first two lines on the right-hand side as the linear part and to the last line on the right-hand side as the degenerate part of the representation. Of course, they coincide with the linear part and the degenerate part of the von Mises decomposition (4). The representation (6) is the sum of the three mappings

$$\begin{aligned}
\Phi_{1,g}:\mathbf{V}\to\mathbb{R},\quad & \Phi_{1,g}(f) := -\int f(x_1-)\,dg_{1,F}(x_1), \\
\Phi_{2,g}:\mathbf{V}\to\mathbb{R},\quad & \Phi_{2,g}(f) := -\int f(x_2-)\,dg_{2,F}(x_2), \\
\Phi_{3,g}:\mathbf{V}\to\mathbb{R},\quad & \Phi_{3,g}(f) := \iint f(x_1-)\,f(x_2-)\,dg(x_1,x_2)
\end{aligned} \tag{7}$$

applied to $F_n-F$, where $\mathbf{V}$ is some suitable space consisting of càdlàg functions on $\mathbb{R}$. Of course, if $g$ is symmetric then (6) can be represented using two mappings only. Now, on the one hand, if the functions $g_{i,F}$, $i=1,2$, generate finite (possibly signed) measures on $\mathbb{R}$, and if $g$ generates a finite (possibly signed) measure on $\mathbb{R}^2$, then the mappings $\Phi_{i,g}$, $i=1,2,3$, are continuous if we endow $\mathbf{V}$ with the uniform sup-metric $d_\infty(f,h) := \|f-h\|_\infty$. On the other hand, if the (possibly signed) measure generated by $g_{i,F}$ is not finite but only $\sigma$-finite, then the map $\Phi_{i,g}$ is obviously not continuous w.r.t. the uniform sup-metric $d_\infty$. However, if we assume, for example, $\int (1/\phi(x))\,|dg_{i,F}|(x) < \infty$ for $i=1,2$, where $\phi:\mathbb{R}\to[1,\infty)$ is any continuous function and $|dg_{i,F}|$ denotes the total variation measure generated by $g_{i,F}$, then we still have $\Phi_{i,g}(f_n)\to\Phi_{i,g}(f)$ for $i=1,2$ when the sequence $(f_n)$ converges to $f$ in the weighted sup-metric $d_\phi(f,h) := \|(f-h)\phi\|_\infty$. If in addition $\iint (\phi(x_1)\phi(x_2))^{-1}\,|dg|(x_1,x_2) < \infty$, where $|dg|$ denotes the total variation measure generated by $g$, then we also have $\Phi_{3,g}(f_n)\to\Phi_{3,g}(f)$ when the sequence $(f_n)$ converges to $f$ in the weighted sup-metric $d_\phi$. That is, under appropriate conditions, we have

$$a_n\big(V_g(F_n)-V_g(F)\big) = \sum_{i=1}^{2}\Phi_{i,g}\big(a_n(F_n-F)\big) + \Phi_{3,g}\big(\sqrt{a_n}\,(F_n-F)\big) \tag{8}$$

with continuous mappings $\Phi_{i,g}$, $i=1,2,3$, and $a_n$ a strictly positive real number. Therefore, once the representation (6) has been established, one only needs weak convergence of the process $a_n(F_n-F)$ w.r.t. the weighted sup-metric $d_\phi$ to make use of the Continuous Mapping Theorem. The latter is not problematic. For instance, weak convergence of empirical processes w.r.t. weighted sup-metrics has been established under various conditions; see, for instance, Beutner et al. (2012), Chen and Fan (2006), Shao and Yu (1996), Shorack and Wellner (1986), Wu (2003, 2008), Yukich (1992). One of the advantages of this approach lies in the fact that weak convergence of $a_n(F_n-F)$ implies that $\sqrt{a_n}\,(F_n-F)$ converges in probability to zero, and hence that $\Phi_{3,g}(\sqrt{a_n}\,(F_n-F))$ converges in probability to zero. Thus, with the continuous mapping approach we can easily deal with the degenerate part of a non-degenerate V-statistic. For a degenerate V-statistic the linear part vanishes, i.e. $\Phi_{1,g}=\Phi_{2,g}=0$, and so in this case (8) multiplied by $a_n$ reads as

$$a_n^2\big(V_g(F_n)-V_g(F)\big) = \Phi_{3,g}\big(a_n(F_n-F)\big). \tag{9}$$

That is, with the continuous mapping approach we can also easily deal with the degenerate part of a degenerate V-statistic. Moreover, the continuous mapping approach can also provide a simple way to derive the asymptotic distribution of a V-statistic when both terms of the von Mises decomposition contribute to the asymptotic distribution, or when the scaling sequence has to be chosen as the cube $(a_n^3)$ or the fourth power $(a_n^4)$ of the scaling sequence $(a_n)$ of $F_n-F$. This will be illustrated in Section 4 in the context of long-memory data; see Examples 4.7, 4.8 and 4.9. In this context the notion of asymptotic degeneracy, to be introduced in Section 2, plays a crucial role.
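The representation (6) can be checked numerically in a simple special case. The sketch below is an illustration under assumptions chosen here, not taken from the paper: the kernel is $g(x_1,x_2) = x_1 x_2$, $F$ is the uniform df on $[0,1]$, and $F_n$ is the empirical df of a simulated sample. Then $g_{1,F}(x) = g_{2,F}(x) = x/2$, the measures $dg_{i,F}$ are $1/2$ times Lebesgue measure, $dg$ is Lebesgue measure on the plane, and $F_n - F$ vanishes outside $[0,1]$, so all three terms of (6) reduce to the single integral $\int_0^1 (F_n(t) - t)\,dt$.

```python
import numpy as np

def integral_ecdf_minus_uniform(x):
    """Exact integral over [0, 1] of (F_n(t) - t) dt for a sample x in (0, 1).

    F_n is piecewise constant with value k/n between the k-th and (k+1)-th
    order statistic, so its integral is a finite sum.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    knots = np.concatenate(([0.0], x, [1.0]))
    levels = np.arange(n + 1) / n            # value of F_n on each interval
    return np.sum(levels * np.diff(knots)) - 0.5

rng = np.random.default_rng(3)
x = rng.uniform(size=200)                    # hypothetical sample from F
mu = 0.5                                     # mean of the uniform df on [0, 1]

i1 = integral_ecdf_minus_uniform(x)
linear_part = -2.0 * mu * i1                 # first two lines of (6)
degenerate_part = i1 ** 2                    # third line of (6)

lhs = np.mean(x) ** 2 - mu ** 2              # V_g(F_n) - V_g(F) for g = x1 * x2
rhs = linear_part + degenerate_part
```

In this toy case the two sides agree up to floating-point error, and the degenerate part is the square of a quantity of order $n^{-1/2}$, which is why it is negligible at the scale $a_n = \sqrt{n}$ when the linear part does not vanish.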

Next, let us briefly discuss the relation between the asymptotic distribution of a U-statistic $U_{g,n}$ and that of the V-statistic $V_g(\hat F_n)$, where as before $\hat F_n$ denotes the empirical df. Since the linear parts of the von Mises decomposition and the Hoeffding decomposition coincide, and the degenerate parts of these decompositions differ only by the term in (5), it can be shown easily that a non-degenerate V-statistic $V_g(\hat F_n)$ w.r.t. $(g,F)$ and the corresponding non-degenerate U-statistic $U_{g,n}$ have the same asymptotic distribution if $a_n = o(n)$ and $\mathbb{E}[|g(X_1,X_1)|] < \infty$; see, for example, Beutner and Zähle (2012, Remark 2.5). Hence, in the non-degenerate case the asymptotic distribution of both U-statistics and V-statistics can be derived from (6). On the other hand, in the degenerate case they differ by a constant if a LLN holds for $\frac{1}{n}\sum_{i=1}^{n} g(X_i,X_i)$. In fact, this follows from

$$\begin{aligned}
n\big(U_{g,n}-V_g(F)\big) ={}& n\big(V_g(\hat F_n)-V_g(F)\big) + \frac{n}{n-1}\,V_g(\hat F_n) - \frac{n}{n-1}\,\frac{1}{n}\sum_{i=1}^{n} g(X_i,X_i) \\
={}& n\big(V_g(\hat F_n)-V_g(F)\big) + \frac{n}{n-1}\big(V_g(\hat F_n)-V_g(F)\big) + \frac{n}{n-1}\Big(V_g(F) - \frac{1}{n}\sum_{i=1}^{n} g(X_i,X_i)\Big).
\end{aligned}$$
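The identity relating $n(U_{g,n}-V_g(F))$ and $n(V_g(\hat F_n)-V_g(F))$ is purely algebraic, so it can be verified on any sample. The following sketch does this for an illustrative kernel with non-vanishing diagonal; the kernel, the simulated data and the variable names are assumptions introduced here. The value used for $V_g(F)$ is the one for the standard normal df, but the identity holds for any number in its place.

```python
import numpy as np

def kernel(x1, x2):
    # illustrative kernel (assumption) with g(x, x) = x^2 on the diagonal
    return x1 * x2 + np.abs(x1 - x2)

rng = np.random.default_rng(1)
x = rng.normal(size=40)                      # hypothetical sample
n = x.size

G = kernel(x[:, None], x[None, :])           # matrix g(X_i, X_j)
v_n = G.mean()                               # V-statistic with the empirical df
u_n = (G.sum() - np.trace(G)) / (n * (n - 1))  # U-statistic (3)
diag_mean = np.trace(G) / n                  # (1/n) * sum of g(X_i, X_i)
v_f = 2.0 / np.sqrt(np.pi)                   # V_g(F) for N(0,1): E[X1 X2] + E|X1 - X2|

lhs = n * (u_n - v_f)
rhs = (n * (v_n - v_f)
       + n / (n - 1) * (v_n - v_f)
       + n / (n - 1) * (v_f - diag_mean))
```

The last summand of `rhs` is the one that converges to the constant $V_g(F) - \mathbb{E}[g(X_1,X_1)]$ when a law of large numbers applies to the diagonal terms.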



It is worth pointing out that for kernels $g$ that are locally of unbounded variation the asymptotic distribution of the corresponding U-statistic cannot be obtained by the continuous mapping approach; see Remark 3.28. However, if such a kernel is non-degenerate, a non-trivial limiting distribution of the corresponding U-statistic can be obtained either by the Hoeffding decomposition or by using the approach of Beutner and Zähle (2012) that is based on the concept of quasi-Hadamard differentiability and the Modified Functional Delta Method. In case the kernel is locally of unbounded variation and degenerate, the approach of Beutner and Zähle (2012) also yields little. However, then the traditional approach to degenerate U- and V-statistics that is briefly recalled after Example 3.18 may lead to a non-trivial limiting distribution.

The rest of the article is organized as follows. In Section 2 we will discuss the notion of (non-) degenerate and asymptotically (non-) degenerate U- and V-statistics. In Section 3 we will first give conditions on the kernel $g$, the df $F$ and the estimator $F_n$ that ensure that the representation (6) holds (Section 3.1). Thereafter we will give interesting examples for kernels $g$ that satisfy these conditions (Section 3.2), we will apply the continuous mapping approach to derive weak and strong limit theorems for U- and V-statistics based on weakly dependent data (Sections 3.3 and 3.4), and we shall discuss some extensions and limitations of our approach (Section 3.5). In Section 4 the whole strength of our approach will be illustrated by deriving non-central limit theorems for U- and V-statistics based on strongly dependent data. The representation (6) along with a new non-central limit theorem for the empirical process of a linear long-memory process offers a very simple way to deduce such non-central limit theorems. We will present in particular three astonishing examples. In Example 4.7 both terms of the von Mises decomposition of a non-degenerate V-statistic contribute to the asymptotic distribution whatever the true df $F$ is (in the Gaussian case this example is already known from Dehling and Taqqu (1991, Section 3)), and in Examples 4.8 and 4.9 the scaling sequences for degenerate V-statistics are given by $(a_n^3)$ and $(a_n^4)$, respectively (and not as usual by the square $(a_n^2)$), of the scaling sequence $(a_n)$ of $F_n-F$.

2 The notions of (non-) degeneracy and asymptotic (non-) 
degeneracy 

In this section we will recall the notion of (non-) degenerate U- and V-statistics, and we shall introduce the notion of asymptotically (non-) degenerate U- and V-statistics. We will restrict to the case where $g$, $F$ and $F_n$ admit the von Mises decomposition (4).

The corresponding V-statistic $V_g(F_n)$ will be called non-degenerate w.r.t. $(g,F)$ if the linear part of the von Mises decomposition (i.e. the sum of the first two lines on the right-hand side in (4)) does not vanish. The corresponding V-statistic $V_g(F_n)$ will be called degenerate w.r.t. $(g,F)$ if the linear part of the von Mises decomposition vanishes, i.e. if $\sum_{i=1}^{2}\int g_{i,F}\,d(F_n-F) = 0$ $\mathbb{P}$-a.s. for every $n\in\mathbb{N}$. This condition holds in particular if $g_{1,F} = g_{2,F} = 0$, or if $F_n$ is a (random) df and both $g_{1,F}$ and $g_{2,F}$ are constant. If the linear part of the von Mises decomposition does (not) vanish when $F_n = \hat F_n$, then we also call the corresponding U-statistic $U_{g,n}$ (non-) degenerate w.r.t. $(g,F)$. Recall that it is very common, mainly in the i.i.d. set-up, to call a U-statistic degenerate if $\mathrm{Var}[g_{i,F}(X_1)] = 0$ for $i=1,2$. Notice that, in this case, this is in line with the convention used here. Indeed, it is easily seen that $\mathrm{Var}[g_{i,F}(X_1)] = 0$ is equivalent to $\int g_{i,F}\,d(\hat F_n-F) = 0$ $\mathbb{P}$-a.s. if $\hat F_n$ is based on an i.i.d. sequence. Table 1 displays some examples for non-degenerate and degenerate U- and V-statistics.

non-degenerate    Gini's mean difference                                      Example 3.10
                  variance                                                    Example 3.11

degenerate        Gini's mean difference
                    (uniform two-point distribution)                          Example 3.12
                  variance
                    (4th central = squared 2nd central moment of F)           Example 3.20(i)
                  Cramér-von Mises                                            Example 3.13
                  test for symmetry                                           Example 3.14

Table 1: Examples for non-degenerate and degenerate V-statistics w.r.t. $(g,F)$.

To introduce the notion of asymptotically (non-) degenerate U- and V-statistics, we let $(a_n)\subset(0,\infty)$ be a scaling sequence such that $a_n(F_n-F)$ converges in distribution to a non-degenerate limit. The representation (8) indicates that for every (non-degenerate) V-statistic $V_g(F_n)$ (w.r.t. $(g,F)$) only the linear part of the von Mises decomposition may contribute to the limiting distribution of $a_n(V_g(F_n)-V_g(F))$. If there is a nontrivial limiting distribution of the linear part weighted by $a_n$, then we call the V-statistic $V_g(F_n)$ asymptotically non-degenerate w.r.t. $(g,F,(a_n))$, and the analogous terminology is used for U-statistics. Of course, every asymptotically non-degenerate U- or V-statistic w.r.t. $(g,F,(a_n))$ must also be non-degenerate w.r.t. $(g,F)$. However, it might happen that the limiting distribution of the linear part weighted by $a_n$ vanishes. In this case, we call the V-statistic $V_g(F_n)$ asymptotically degenerate w.r.t. $(g,F,(a_n))$, and again the analogous terminology is used for U-statistics. Of course, every degenerate U- or V-statistic w.r.t. $(g,F)$ is also asymptotically degenerate w.r.t. $(g,F,(a_n))$.

For an asymptotically degenerate U- or V-statistic w.r.t. $(g,F,(a_n))$ a nontrivial asymptotic distribution can typically be obtained by weighting the empirical difference by $a_n^2$ instead of $a_n$, i.e. by considering the limiting distribution of $a_n^2(V_g(F_n)-V_g(F))$. In this context two different things may occur:



1) The asymptotic distribution of $a_n^2(V_g(F_n)-V_g(F))$ is non-trivial. In this case, we say that the asymptotically degenerate U- or V-statistic w.r.t. $(g,F,(a_n))$ is of type 1.

2) The asymptotic distribution of $a_n^2(V_g(F_n)-V_g(F))$ is still degenerate. In this case, we say that the asymptotically degenerate U- or V-statistic w.r.t. $(g,F,(a_n))$ is of type 2.

It seems that behavior 2) only appears in the presence of long-memory sequences; for examples see Examples 4.8 and 4.9. It is worth pointing out that for (asymptotically) degenerate U- and V-statistics of type 2 a nontrivial limiting distribution can sometimes be obtained by considering the limiting distribution of $a_n^p(V_g(F_n)-V_g(F))$ for some $p>2$; see again Examples 4.8 and 4.9. In case 1) we can distinguish between the following three cases:

1.a) Only the degenerate part of the von Mises decomposition contributes to the limiting distribution of $a_n^2(V_g(F_n)-V_g(F))$. This is in particular the case if the U- or V-statistic is even degenerate w.r.t. $(g,F)$.

1.b) Only the linear part contributes to the limiting distribution of $a_n^2(V_g(F_n)-V_g(F))$. This can happen only if the U- or V-statistic is asymptotically degenerate w.r.t. $(g,F,(a_n))$, but non-degenerate w.r.t. $(g,F)$.

1.c) Both the linear and the degenerate part contribute to the limiting distribution of $a_n^2(V_g(F_n)-V_g(F))$. Again, this can occur only if the U- or V-statistic is asymptotically degenerate w.r.t. $(g,F,(a_n))$, but non-degenerate w.r.t. $(g,F)$.

In the original version of the manuscript we guessed that the cases 1.b) and 1.c) only appear for U- and V-statistics based on long-memory sequences. However, a referee provided us with the following example that shows that behavior 1.c) also occurs for $m$-dependent sequences.

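Example 2.1 Let $(\xi_n)$ and $(\delta_n)$ be independent i.i.d. sequences with $\mathbb{P}[\xi_1 = 0] = \mathbb{P}[\xi_1 = 1] = 1/2$ and $\mathbb{P}[\delta_1 = 0] = \mathbb{P}[\delta_1 = 1] = 1/2$. Define the 1-dependent sequence $(Z_i)$ by $Z_i := \xi_i - \xi_{i-1}$ and the 1-dependent sequence $(X_i)$ by

$$X_i := \begin{cases}
\sqrt{2}, & Z_i = 1 \text{ and } \delta_i = 1, \\
1, & Z_i = 0 \text{ and } \delta_i = 1, \\
0, & Z_i = -1, \\
-1, & Z_i = 0 \text{ and } \delta_i = 0, \\
-\sqrt{2}, & Z_i = 1 \text{ and } \delta_i = 0.
\end{cases}$$

We then have $\mu := \mathbb{E}[X_1] = 0$, $\sigma^2 := \mathrm{Var}[X_1] = 1$, and $(X_i-\mu)^2-\sigma^2 = Z_i$. Denote by $\hat\sigma_n^2$ the sample variance, which is the V-statistic $V_g(\hat F_n)$ with kernel $g(x_1,x_2) = (1/2)(x_1-x_2)^2$. The Hoeffding decomposition of $\hat\sigma_n^2-\sigma^2$ is given by

$$\hat\sigma_n^2-\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\big((X_i-\mu)^2-\sigma^2\big) - \Big(\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)\Big)^2 = \frac{1}{n}(\xi_n-\xi_0) - \Big(\frac{1}{n}\sum_{i=1}^{n}X_i\Big)^2,$$

where the latter identity follows from $(X_i-\mu)^2-\sigma^2 = Z_i$, $\sum_{i=1}^{n} Z_i = \xi_n-\xi_0$ and $\mu = 0$. Thus we have

$$\sqrt{n}\,(\hat\sigma_n^2-\sigma^2) = \frac{1}{\sqrt{n}}(\xi_n-\xi_0) - \frac{1}{\sqrt{n}}\Big(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i\Big)^2,$$

and so we obtain from the central limit theorem along with Slutsky's lemma, and the fact that $\xi_n-\xi_0$ has the same distribution for every $n\in\mathbb{N}$, that $\hat\sigma_n^2$ is non-degenerate w.r.t. $(g,F)$, but asymptotically degenerate w.r.t. $(g,F,(\sqrt{n}))$, where $F$ refers to the df of the $X_i$. On the other hand, we have

$$n\,(\hat\sigma_n^2-\sigma^2) = (\xi_n-\xi_0) - \Big(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i\Big)^2.$$

As already mentioned, $\xi_n-\xi_0$ has the same distribution for every $n\in\mathbb{N}$, and $(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i)^2$ converges in distribution to a $\chi^2$ distribution with one degree of freedom. Since $\xi_n-\xi_0$ is the linear part and $(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i)^2$ the degenerate part of the Hoeffding decomposition, the example shows that even for $m$-dependent sequences both terms of the Hoeffding decomposition may contribute to the limiting distribution. ◇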
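The algebra behind Example 2.1 can be replayed on simulated data. The following sketch is a hedged illustration: the sample size, the seed and all variable names are choices made here. It builds $(X_i)$ from $(\xi_i)$ and $(\delta_i)$ as in the example, then checks that $X_i^2 - 1 = Z_i$ and that $n(\hat\sigma_n^2 - \sigma^2)$ splits into $\xi_n - \xi_0$ minus the squared normalized sum.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000                                     # hypothetical sample size

xi = rng.integers(0, 2, size=n + 1)          # xi_0, ..., xi_n, fair coin flips
delta = rng.integers(0, 2, size=n + 1)       # only delta_1, ..., delta_n are used

z = xi[1:] - xi[:-1]                         # Z_i = xi_i - xi_{i-1}, values in {-1, 0, 1}
sign = np.where(delta[1:] == 1, 1.0, -1.0)   # delta_i = 1 gives +, delta_i = 0 gives -
x = sign * np.sqrt(z + 1.0)                  # sqrt(2), 1 or 0 in absolute value

sigma_hat2 = np.var(x)                       # V-statistic with the variance kernel
linear_part = float(xi[-1] - xi[0])          # xi_n - xi_0, does not shrink with n
degenerate_part = (x.sum() / np.sqrt(n)) ** 2

lhs = n * (sigma_hat2 - 1.0)                 # sigma^2 = 1 in Example 2.1
rhs = linear_part - degenerate_part
```

Since `linear_part` only takes the values $-1$, $0$ and $1$ for every $n$, it is negligible at the scale $\sqrt{n}$ but of the same order as the degenerate part at the scale $n$.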

Table 2 displays some examples for asymptotically non-degenerate and degenerate U- and V-statistics for a linear process with long-memory. It is worth mentioning that there is a difference between the asymptotic degeneracy of the variance in the case of a linear long-memory sequence and the $m$-dependent sequence of Example 2.1. For a long-memory sequence the variance is asymptotically degenerate whatever the underlying df, whereas for an $m$-dependent sequence it does depend on the underlying df whether the variance is asymptotically degenerate or not.

3 Representation (6): conditions, examples, and applications

Let $\mathbf{D}$ be the space of all bounded càdlàg functions on $\mathbb{R}$. Any metric subspace $(\mathbf{V},d)$ of $\mathbf{D}$ will be equipped with the $\sigma$-algebra $\mathcal{V} := \mathcal{D}\cap\mathbf{V}$ to make it a measurable space, where $\mathcal{D}$ is the $\sigma$-algebra generated by the usual coordinate projections $\pi_x:\mathbf{D}\to\mathbb{R}$, $x\in\mathbb{R}$.



asymptotically non-degenerate         Gini's mean difference (p = 1)           Disc. before Coroll. 4.3

asymptotically degenerate, type 1     1.a) Cramér-von Mises (p = 2)            Disc. before Coroll. 4.3
                                      1.b) squared absolute mean (p = 2)       Example 4.6
                                      1.c) variance (p = 2)                    Example 4.7

asymptotically degenerate, type 2     some artificial kernel (p = 3)           Example 4.8
                                      test for symmetry (p = 4)                Example 4.9

Table 2: Examples for asymptotically non-degenerate and asymptotically degenerate U- and V-statistics w.r.t. $(g,F,(a_n))$ for $a_n := n^{p(\beta-1/2)}\,\ell(n)^{-p}$ with $p(2\beta-1) < 1$, where the observations are drawn from a linear process $X_t := \sum_{s=0}^{\infty} a_s\varepsilon_{t-s}$ with $a_s = s^{-\beta}\ell(s)$ for some $\beta\in(1/2,1)$ and some slowly varying $\ell$ (long-memory).



The roles of $\mathbf{V}$ and $d$ will often be played by the space $\mathbf{D}_\phi$ of all $f\in\mathbf{D}$ with $\|f\phi\|_\infty < \infty$ and the weighted sup-metric $d_\phi(f,h) := \|(f-h)\phi\|_\infty$, respectively, where $\phi:\overline{\mathbb{R}}\to[1,\infty]$ is any continuous function being real-valued on $\mathbb{R}$ (henceforth called weight function) and where we use the convention $0\cdot\infty := 0$. We will frequently work with the particular weight function $\phi_\lambda(x) := (1+|x|)^\lambda$ for fixed $\lambda$.

Further, let $\mathrm{BV}_{\mathrm{loc,rc}}$ be the space of all functions on $\mathbb{R}$ that are right-continuous and locally of bounded variation, and notice that every function in $\mathrm{BV}_{\mathrm{loc,rc}}$ also has left-hand limits. For $\psi\in\mathrm{BV}_{\mathrm{loc,rc}}$, we denote by $d\psi^+$ and $d\psi^-$ the unique positive Radon measures induced by the Jordan decomposition of $\psi$, and we set $|d\psi| := d\psi^+ + d\psi^-$. Analogously, let $\mathrm{BV}^2_{\mathrm{loc,rc}}$ be the space of all functions on $\mathbb{R}^2$ that are upper right-continuous and locally of bounded variation, and for $\tau\in\mathrm{BV}^2_{\mathrm{loc,rc}}$, $d\tau^+$, $d\tau^-$ and $|d\tau|$ are defined analogously to $d\psi^+$, $d\psi^-$ and $|d\psi|$; for details see the discussion subsequent to Remark 3.5 below. We shall interpret integrals as being over the open intervals $(-\infty,\infty)$ and $(-\infty,\infty)^2$, i.e. $\int = \int_{(-\infty,\infty)}$ and $\iint = \iint_{(-\infty,\infty)^2}$. Moreover, for a measurable function $f$ we shall say that the integral of $f$ w.r.t. a signed measure $\mu$ exists if the four integrals $\int f^+\,d\mu^+$, $\int f^-\,d\mu^+$, $\int f^+\,d\mu^-$ and $\int f^-\,d\mu^-$ are all finite, where $f^+$ and $f^-$ denote the positive and the negative part of $f$, and $\mu^+$ and $\mu^-$ denote the positive and the negative part of $\mu$. We denote by $\xrightarrow{d}$ convergence in distribution in the sense of Pollard (1984), and the Borel $\sigma$-algebra in $\mathbb{R}$ is denoted by $\mathcal{B}(\mathbb{R})$.



3.1 Conditions for the representation (6)

In this section, we provide conditions on $g$, $F$ and the estimate $F_n$ of $F$ under which the representation (6) holds true. First of all we impose assumptions on $g$, $F$ and $F_n$ that ensure that $V_g(F)$ and $V_g(F_n)$ are well defined.

Assumption 3.1 The integral in (1) exists, the estimate $F_n$ of $F$ is a non-decreasing càdlàg process with variation bounded by 1, and for all $n\in\mathbb{N}$ we have that $\mathbb{P}$-a.s.

$$\iint |g(x_1,x_2)|\,dF_n(x_1)\,dF_n(x_2) < \infty.$$

A further minimum requirement for the representation (6) is the following.

Assumption 3.2 For all $n\in\mathbb{N}$ we have that $\mathbb{P}$-a.s.

$$\iint |g(x_1,x_2)|\,dF_n(x_1)\,dF(x_2) < \infty \quad\text{and}\quad \iint |g(x_1,x_2)|\,dF(x_1)\,dF_n(x_2) < \infty.$$

Notice that the conditions on $F_n$ imposed by Assumptions 3.1 and 3.2 are always fulfilled if the integral in (1) exists and $F_n$ is the empirical df $\hat F_n$. From (4) and Lemmas 3.4 and 3.6 below, we immediately obtain the following theorem.

Theorem 3.3 If the assumptions of Lemmas 3.4 and 3.6 (below) are fulfilled, then the representation (6) of $V_g(F_n)-V_g(F)$ holds true $\mathbb{P}$-a.s. for every $n\in\mathbb{N}$.

The following lemma gives conditions that allow us to apply almost surely an integration-by-parts formula (see Beutner and Zähle (2012, Lemma B.1)) to the first and to the second line on the right-hand side in (4).

Lemma 3.4 Suppose that

(a) Assumptions 3.1 and 3.2 hold,

(b) $g_{i,F}\in\mathrm{BV}_{\mathrm{loc,rc}}$ for $i=1,2$,

(c) $\int |F_n(x-)-F(x-)|\,|dg_{i,F}|(x) < \infty$ $\mathbb{P}$-a.s., for all $n\in\mathbb{N}$ and $i=1,2$,

(d) $\lim_{|x|\to\infty}(F_n-F)(x)\,g_{i,F}(x) = 0$ $\mathbb{P}$-a.s., for all $n\in\mathbb{N}$ and $i=1,2$.

Then $\mathbb{P}$-a.s., for every $n\in\mathbb{N}$,

$$\iint g(x_1,x_2)\,dF_n(x_1)\,dF(x_2) - \iint g(x_1,x_2)\,dF(x_1)\,dF(x_2) = -\int [F_n(x_1-)-F(x_1-)]\,dg_{1,F}(x_1),$$

and

$$\iint g(x_1,x_2)\,dF(x_1)\,dF_n(x_2) - \iint g(x_1,x_2)\,dF(x_1)\,dF(x_2) = -\int [F_n(x_2-)-F(x_2-)]\,dg_{2,F}(x_2).$$
Proof We only prove the first equation. From Assumptions 3.1 and 3.2 and using Fubini's theorem, we have that both

$$\int |g_{1,F}(x_1)|\,dF_n(x_1) \quad\text{and}\quad \int |g_{1,F}(x_1)|\,dF(x_1) \tag{10}$$

exist, and so $\int |g_{1,F}(x_1)|\,|d(F_n-F)|(x_1)$ exists. Moreover, for every $n\in\mathbb{N}$, we obtain by using Fubini's theorem

$$\iint g(x_1,x_2)\,dF_n(x_1)\,dF(x_2) - \iint g(x_1,x_2)\,dF(x_1)\,dF(x_2) = \int g_{1,F}(x_1)\,d(F_n-F)(x_1).$$

Since by assumption (c) we further have that $\int |F_n(x_1-)-F(x_1-)|\,|dg_{1,F}|(x_1)$ exists, the conditions of Lemma B.1 in Beutner and Zähle (2012) are fulfilled and the result follows. □

Remark 3.5 (i) For $F_n = \hat F_n$ (recall that $\hat F_n$ denotes the empirical df) conditions (c) and (d) of Lemma 3.4 boil down to conditions on the tails of $F$ and $g_{i,F}$, $i=1,2$.

(ii) More generally, if for $\mathbb{P}$-almost every $\omega$ there exist real numbers $x_\ell(\omega)$ and $x_u(\omega)$ such that $F_n(\omega,x)-F(x) = -F(x)$ for all $x < x_\ell(\omega)$, and $F_n(\omega,x)-F(x) = 1-F(x)$ for all $x > x_u(\omega)$, then again conditions (c) and (d) of Lemma 3.4 boil down to conditions on the tails of $F$ and $g_{i,F}$, $i=1,2$.

(iii) Condition (c) holds if $\int dg_{i,F}$ exists for $i=1,2$, and under the conditions of part (ii) of this remark we have that condition (d) holds if $\|g_{i,F}\|_\infty < \infty$ for $i=1,2$.

(iv) If for some weight function $\phi$ we have that $d_\phi(F_n,F)$ is $\mathbb{P}$-a.s. finite for every $n\in\mathbb{N}$, and that $\int 1/\phi\,|dg_{i,F}| < \infty$ and $\lim_{|x|\to\infty} g_{i,F}(x)/\phi(x) = 0$ for $i=1,2$, then again conditions (c) and (d) of Lemma 3.4 hold. Notice also that under the conditions of part (ii) of this remark, the condition that $d_\phi(F_n,F)$ is $\mathbb{P}$-a.s. finite for all $n\in\mathbb{N}$ is a condition on the tails of $F$.

(v) If there are $x_{\ell,i} < x_{u,i}$ such that $|g_{i,F}|$ is non-increasing on $(-\infty,x_{\ell,i}]$ and non-decreasing on $[x_{u,i},\infty)$ for $i=1,2$, then part (d) of Lemma 3.4 is already implied by Assumptions 3.1 and 3.2. Indeed, under these assumptions the integrals in (10) exist, and we have for $x > x_{u,i}$

$$\big|g_{i,F}(x)\,(F_n(x)-F(x))\big| = \Big|\int_x^\infty g_{i,F}(x)\,dF(t) - \int_x^\infty g_{i,F}(x)\,dF_n(t)\Big| \le \int_x^\infty |g_{i,F}(t)|\,dF(t) + \int_x^\infty |g_{i,F}(t)|\,dF_n(t)$$

for $i=1,2$. Thus, $\lim_{x\to\infty}(F_n-F)(x)\,g_{i,F}(x) = 0$ for $i=1,2$. Analogously we obtain $\lim_{x\to-\infty}(F_n-F)(x)\,g_{i,F}(x) = 0$. ◇



For the functional $\Phi_{3,g}$ to be well defined in the Lebesgue-Stieltjes sense the kernel $g$ must be upper right-continuous and locally of bounded variation. For later use and the reader's convenience we recall the definition of locally bounded variation. For any function $\tau:\mathbb{R}^2\to\mathbb{R}$, set

$$\Delta_\tau\big(\mathcal{R}_{(x_1,x_2),(y_1,y_2)}\big) := \tau(x_2,y_2) - \tau(x_1,y_2) - \tau(x_2,y_1) + \tau(x_1,y_1)$$

for every half-open rectangle $\mathcal{R}_{(x_1,x_2),(y_1,y_2)} = (x_1,x_2]\times(y_1,y_2]$ with $(x_1,x_2)\in\mathbb{R}^2$, $x_1 < x_2$, and $(y_1,y_2)\in\mathbb{R}^2$, $y_1 < y_2$. For a fixed half-open rectangle $\mathcal{R} = \mathcal{R}_{(a_1,a_2),(b_1,b_2)} = (a_1,a_2]\times(b_1,b_2]$ in $\mathbb{R}^2$, a pair $P$ of finite sequences $(x_k)_{k=0,\ldots,n}$ and $(y_\ell)_{\ell=0,\ldots,m}$ is called a grid for $\mathcal{R}$ if $a_1 = x_0 < x_1 < \cdots < x_n = a_2$ and $b_1 = y_0 < y_1 < \cdots < y_m = b_2$. For any grid $P$, let

$$\begin{aligned}
V(P,\tau) :={}& \sum_{i=1}^{n}\sum_{j=1}^{m}\big|\Delta_\tau\big(\mathcal{R}_{(x_{i-1},x_i),(y_{j-1},y_j)}\big)\big| \\
={}& \sum_{i=1}^{n}\sum_{j=1}^{m}\big|\tau(x_i,y_j) - \tau(x_{i-1},y_j) - \tau(x_i,y_{j-1}) + \tau(x_{i-1},y_{j-1})\big|.
\end{aligned}$$

Moreover, let $V_\tau(\mathcal{R}) := \sup_{P\in\mathcal{P}} V(P,\tau)$, where $\mathcal{P}$ is the set of all grids for $\mathcal{R}$. The function $\tau$ is said to be locally of bounded variation if for every bounded half-open rectangle $\mathcal{R}\subset\mathbb{R}^2$ we have $V_\tau(\mathcal{R}) < \infty$, and $\tau$ is said to be of bounded total variation if there is a constant $C > 0$ such that $V_\tau(\mathcal{R}) \le C$ for all bounded half-open rectangles $\mathcal{R}\subset\mathbb{R}^2$. As mentioned earlier, $\mathrm{BV}^2_{\mathrm{loc,rc}}$ denotes the space of all upper right-continuous functions $\tau:\mathbb{R}^2\to\mathbb{R}$ that are locally of bounded variation, and we use the two-dimensional Jordan decomposition (see, for instance, Ghorpade and Limaye (2010, Proposition 1.17)) to define $d\tau^+$, $d\tau^-$ and $|d\tau|$ similarly to $d\psi^+$, $d\psi^-$ and $|d\psi|$. We can now state the two-dimensional integration-by-parts lemma, which can almost surely be applied to the third line on the right-hand side in (4).
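The grid sum $V(P,\tau)$ from the definition above is straightforward to evaluate. The following sketch is illustrative only: the function name `grid_variation`, the two test functions and the grids are assumptions made here. It computes $V(P,\tau)$ for a product grid on $(0,1]\times(0,1]$. Note that a single grid only gives a lower bound for the supremum $V_\tau(\mathcal{R})$.

```python
import numpy as np

def grid_variation(tau, xs, ys):
    """Grid sum V(P, tau) for the grid P given by increasing knots xs and ys."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    t = tau(xs[:, None], ys[None, :])        # tau evaluated at all grid nodes
    # increment of tau over each cell (x_{i-1}, x_i] x (y_{j-1}, y_j]
    incr = t[1:, 1:] - t[:-1, 1:] - t[1:, :-1] + t[:-1, :-1]
    return np.abs(incr).sum()

xs = np.linspace(0.0, 1.0, 11)
ys = np.linspace(0.0, 1.0, 6)

# tau(x, y) = x * y: every cell increment is (x_i - x_{i-1}) * (y_j - y_{j-1}) > 0,
# so the grid sum equals the area of the rectangle.
var_product = grid_variation(lambda x, y: x * y, xs, ys)

# tau(x, y) = u(x) + v(y): every cell increment vanishes.
var_additive = grid_variation(lambda x, y: np.sin(7 * x) + y ** 2, xs, ys)
```

The second test function illustrates that additive terms depending on one coordinate only do not contribute to the rectangle increments, whatever the variation of these terms as functions of a single variable.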

Lemma 3.6 Suppose that

(a) Assumption 3.1 holds,

(b) $g\in\mathrm{BV}^2_{\mathrm{loc,rc}}$, and the functions $g_{x_1}(\cdot) := g(x_1,\cdot)$ and $g_{x_2}(\cdot) := g(\cdot,x_2)$ are locally of bounded variation for every fixed $x_1$ and $x_2$, respectively,

(c) $\iint |F_n(x_1-)-F(x_1-)|\,|F_n(x_2-)-F(x_2-)|\,|dg|(x_1,x_2) < \infty$ $\mathbb{P}$-a.s., for all $n\in\mathbb{N}$,

(d) the following limits exist and equal zero $\mathbb{P}$-a.s. for all $n\in\mathbb{N}$:

$$\begin{aligned}
\lim_{a_1,b_1\to-\infty,\ a_2,b_2\to\infty}\Big[\ & (F_n-F)(b_2)\int_{a_1}^{a_2}(F_n-F)(x_1-)\,dg_{b_2}(x_1) \\
& - (F_n-F)(b_1)\int_{a_1}^{a_2}(F_n-F)(x_1-)\,dg_{b_1}(x_1) \\
& + (F_n-F)(a_2)\int_{b_1}^{b_2}(F_n-F)(x_2-)\,dg_{a_2}(x_2) \\
& - (F_n-F)(a_1)\int_{b_1}^{b_2}(F_n-F)(x_2-)\,dg_{a_1}(x_2)\Big],
\end{aligned}$$

$$\begin{aligned}
\lim_{a_1,b_1\to-\infty,\ a_2,b_2\to\infty}\Big[\ & (F_n-F)(a_2)\,(F_n-F)(b_2)\,g(a_2,b_2) \\
& - (F_n-F)(a_1)\,(F_n-F)(b_2)\,g(a_1,b_2) \\
& - (F_n-F)(a_2)\,(F_n-F)(b_1)\,g(a_2,b_1) \\
& + (F_n-F)(a_1)\,(F_n-F)(b_1)\,g(a_1,b_1)\Big].
\end{aligned}$$

Then $\mathbb{P}$-a.s., for every $n\in\mathbb{N}$,

$$\iint g(x_1,x_2)\,d(F_n-F)(x_1)\,d(F_n-F)(x_2) = \iint (F_n-F)(x_1-)\,(F_n-F)(x_2-)\,dg(x_1,x_2).$$

In part (d) of the lemma the expression "$\lim_{a_1,b_1\to-\infty,\ a_2,b_2\to\infty}(\cdots)$" is understood as convergence of a net $(\cdots)_{(n_1,n_2,n_3,n_4)\in\mathbb{N}^4}$, with $(-a_1,a_2,-b_1,b_2)$ playing the role of $(n_1,n_2,n_3,n_4)$, where as usual $\mathbb{N}^4$ is regarded as a directed set w.r.t. the relation $\preceq$, and $(m_1,m_2,m_3,m_4)\preceq(n_1,n_2,n_3,n_4)$ means that $m_i\le n_i$ for $i=1,\ldots,4$. The analogous interpretations are used for the other limits.



Proof (of Lemma 3.6) For two functions $f,h\in\mathrm{BV}^2_{\mathrm{loc,rc}}$ and every fixed rectangle $(a_1,a_2]\times(b_1,b_2]$ we have

$$\begin{aligned}
\int_{a_1}^{a_2}\!\!\int_{b_1}^{b_2} f(x_1,x_2)\,dh(x_1,x_2) ={}& \int_{a_1}^{a_2}\!\!\int_{b_1}^{b_2} h(x_1-,x_2-)\,df(x_1,x_2) \\
& - \int_{a_1}^{a_2} h(x_1-,b_2)\,df_{b_2}(x_1) - \int_{b_1}^{b_2} h(a_2,x_2-)\,df_{a_2}(x_2) \\
& + \int_{a_1}^{a_2} h(x_1-,b_1)\,df_{b_1}(x_1) + \int_{b_1}^{b_2} h(a_1,x_2-)\,df_{a_1}(x_2) \\
& + f(a_2,b_2)\,h(a_2,b_2) - f(a_2,b_1)\,h(a_2,b_1) \\
& - f(a_1,b_2)\,h(a_1,b_2) + f(a_1,b_1)\,h(a_1,b_1);
\end{aligned}$$

see Gill et al. (1995, Lemma 2.2). The remaining part of the proof is then similar to the proof of Lemma B.1 in Beutner et al. (2012). □
Remark 3.7 (i) Again, if for P-almost every $\omega$ there are $x_\ell(\omega)$ and $x_u(\omega)$ as in Remark
3.5 (ii), then the conditions in parts (c) and (d) of Lemma 3.6 reduce to conditions on
$F$ and $g$. Furthermore, if such $x_\ell(\omega)$ and $x_u(\omega)$ exist, then conditions (c) and (d) of
Lemma 3.6 hold whenever $\int\!\!\int |dg| < \infty$ as well as $\sup_{x_i\in\mathbb{R}} \int |dg_{x_i}| < \infty$ for $i = 1, 2$.

(ii) If for some weight function $\phi$ we have that $d_\phi(F_n, F)$ is P-a.s. finite, that the
integral $\int\!\!\int 1/(\phi(x_1)\phi(x_2))\,|dg|(x_1,x_2)$ is finite, that
$\lim_{x_i\to\pm\infty} \frac{1}{\phi(x_i)} \int \frac{1}{\phi(x)}\,|dg_{x_i}|(x) = 0$
holds for $i = 1,2$, and that $g(x_1,x_2)/(\phi(x_1)\phi(x_2))$ converges to zero as $|x_1|,|x_2| \to \infty$,
then again conditions (c) and (d) of Lemma 3.6 hold.

(iii) Right-continuity of $g_{x_1}$ and $g_{x_2}$, which is needed for the integrals in part (d) of
Lemma 3.6 to be well defined, is implied by right-continuity of $g$.

(iv) It is worth pointing out that $g_{x_1}$ and $g_{x_2}$ being locally of bounded variation does
not imply that $g \in \mathbb{BV}^2_{\mathrm{loc,rc}}$; see Remark 3.28 below. Moreover, $g \in \mathbb{BV}^2_{\mathrm{loc,rc}}$ does not
imply that $g_{x_1}$ and $g_{x_2}$ are locally of bounded variation. Take, for example, the function
$g : [0,1] \times [0,1] \to \mathbb{R}$ defined by $g(0,0) := 0$ and $g(x_1,x_2) := x_1 \sin(\ldots)$, $(x_1,x_2) \ne (0,0)$.

(v) If $g : \mathbb{R}^2 \to \mathbb{R}$ is continuous, the partial derivative $\partial g/\partial x_1$ exists and is continuous,
and the mixed partial derivative $\partial^2 g/(\partial x_1 \partial x_2)$ exists and is bounded on every rectangle
$\mathcal{R} \subset \mathbb{R}^2$, then $g$ is locally of bounded variation; see, for instance, Ghorpade and Limaye
(2010, Proposition 3.59).

(vi) If for all $(x_1,x_2), (y_1,y_2) \in \mathbb{R}^2$ with $x_1 \le x_2$ and $y_1 \le y_2$ we have that

$$g(x_2,y_2) + g(x_1,y_1) \ge g(x_2,y_1) + g(x_1,y_2), \qquad (11)$$

then $g$ is locally of bounded variation. The same claim holds if we have $\le$ instead of $\ge$
in (11). See, for instance, Ghorpade and Limaye (2010, Proposition 1.15). ◇



3.2 Examples for the representation (6)

In this section, we give some examples for set-ups under which the representation (6)
holds. In the third, fourth and fifth example the set-up is degenerate because there
$g_{1,F} = g_{2,F} \equiv 0$. Before turning to the examples, we state two remarks including some
notation needed for the examples.

Remark 3.8 Recall that two functions $f_1, f_2 \in \mathbb{BV}^2_{\mathrm{loc,rc}}$ generate the same measure on
$\mathbb{R}^2$ if $f_1(x_1,x_2) = f_2(x_1,x_2) + h_1(x_1) + h_2(x_2)$ for some functions $h_1, h_2 : \mathbb{R} \to \mathbb{R}$. ◇



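Remark 3.9 For any positive measure $\mu$ on $\mathbb{R}$, and any measurable function $w : \mathbb{R} \to \mathbb{R}_+$,
define the measure $\mathcal{H}^1_{w,\mu}$ on $\mathbb{R}^2$ by

$$\mathcal{H}^1_{w,\mu}(A) := \int w(x)\,\delta_{(x,x)}(A)\,\mu(dx), \qquad A \in \mathcal{B}(\mathbb{R}^2), \qquad (12)$$

with $\delta_{(x,x)}$ the Dirac measure at $(x,x) \in \mathbb{R}^2$. In particular, for $\mathcal{H}^1_{w,\mu}$-integrable $f : \mathbb{R}^2 \to \mathbb{R}$
we have

$$\int\!\!\int f(x_1,x_2)\,\mathcal{H}^1_{w,\mu}(d(x_1,x_2)) = \int w(x) f(x,x)\,\mu(dx). \qquad (13)$$

The $\mathcal{H}^1_{w,\mu}$-measure of a rectangle $\mathcal{R}$ intersecting the diagonal $D = \{(x,x) : x \in \mathbb{R}\}$
is equal to the integral $\int_{\mathcal{R}_\pi} w(x)\,\mu(dx)$, where $\mathcal{R}_\pi$ is the projection on one
of the axes of that piece of the diagonal $D$ that is contained in the rectangle. So one
easily sees that for every $\mathcal{R}_{a,b} = (a_1,a_2] \times (b_1,b_2]$

$$\mathcal{H}^1_{w,\mu}(\mathcal{R}_{a,b}) = \begin{cases} \int_{(\max\{a_1,b_1\},\,\min\{a_2,b_2\}]} w(x)\,\mu(dx), & \mathcal{R}_{a,b} \cap D \ne \emptyset, \\ 0, & \text{else.} \end{cases} \qquad (14)$$

If $w \equiv 1$ and $\mu$ is the Lebesgue measure $\ell$ on $\mathbb{R}$, then $\mathcal{H}^1_{1,\ell}$ coincides with the one-
dimensional Hausdorff measure $\mathcal{H}^1$ in $\mathbb{R}^2$ restricted to the diagonal $D$ and weighted by
the constant $1/\sqrt{2}$, i.e. with $\mathcal{H}^1(\,\cdot \cap D)/\sqrt{2}$. In this case, we also write $\mathcal{H}^1_D$ instead of
$\mathcal{H}^1_{1,\ell}$. As special cases of (13) and (14) we obtain

$$\int\!\!\int f(x_1,x_2)\,\mathcal{H}^1_D(d(x_1,x_2)) = \int f(x,x)\,dx$$

and

$$\mathcal{H}^1_D(\mathcal{R}_{a,b}) = \begin{cases} \min\{a_2,b_2\} - \max\{a_1,b_1\}, & \mathcal{R}_{a,b} \cap D \ne \emptyset, \\ 0, & \text{else.} \end{cases}$$

Analogously, we let $\mathcal{H}^1_{\overline{D}}$ be the one-dimensional Hausdorff measure $\mathcal{H}^1$ in $\mathbb{R}^2$ restricted
to the rotated diagonal $\overline{D} = \{(x,-x) : x \in \mathbb{R}\}$ and weighted by the constant $1/\sqrt{2}$, and
we note that

$$\int\!\!\int f(x_1,x_2)\,\mathcal{H}^1_{\overline{D}}(d(x_1,x_2)) = \int f(x,-x)\,dx$$

for every $\mathcal{H}^1_{\overline{D}}$-integrable $f : \mathbb{R}^2 \to \mathbb{R}$. ◇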

Example 3.10 (Gini's mean difference) If $g(x_1,x_2) = |x_1 - x_2|$ and $F$ has a finite first
moment, then $V_g(F)$ equals Gini's mean difference $E[|X_1 - X_2|]$ of two i.i.d. random
variables $X_1$ and $X_2$ with df $F$. We will now verify that, if $F$ and the estimator $F_n$
satisfy Assumptions 3.1-3.2 for this $g$, and $d_\phi(F_n,F)$ is P-a.s. finite for all $n \in \mathbb{N}$ and
some weight function $\phi$ satisfying $\int 1/\phi(x)\,dx < \infty$, then the assumptions of Lemmas
3.4 and 3.6 hold true and we have

$$dg_{1,F}(x) = dg_{2,F}(x) = (2F(x) - 1)\,dx \qquad (15)$$

as well as

$$dg(x_1,x_2) = -2\,\mathcal{H}^1_D(d(x_1,x_2)) \qquad (16)$$

with the notation of Remark 3.9. Notice that (15) shows in particular that the V-statistic
corresponding to Gini's mean difference is typically non-degenerate w.r.t. $(g, F)$ in the
sense of Section 2.

It was shown in Example 3.1 in Beutner and Zähle (2012) that
$g_{i,F}(x) = -E[X_1] + x + 2\int_x^\infty (1 - F(y))\,dy$ for $i = 1,2$. Therefore, our assumption $\int 1/\phi(x)\,dx < \infty$ implies
$\int 1/\phi(x)\,|dg_{i,F}|(x) < \infty$ for $i = 1,2$. From this, and using Remark 3.5 (iv), we obtain the
validity of assumptions (b)-(d) of Lemma 3.4. It was also established in Example 3.1 in
Beutner and Zähle (2012) that (15) holds true. We next focus on (16) and assumptions
(b)-(d) of Lemma 3.6.

As for (b): It was already shown in Example 3.1 in Beutner and Zähle (2012) that
$g_{x_1}$ and $g_{x_2}$ are locally of bounded variation for the above $g$. Further, for an arbitrary
rectangle $\mathcal{R}_{a,b} = (a_1,a_2] \times (b_1,b_2]$ with $a_2 \le b_1$ we have

$$|a_2 - b_2| + |a_1 - b_1| - |a_1 - b_2| - |a_2 - b_1| = 0.$$

The same holds for all rectangles $\mathcal{R}_{a,b} = (a_1,a_2] \times (b_1,b_2]$ with $b_2 \le a_1$. Now, consider
a rectangle $\mathcal{R}_{a,b} = (a_1,a_2] \times (b_1,b_2]$ with $a_2 \ge b_2 \ge a_1 \ge b_1$. Then

$$|a_2 - b_2| + |a_1 - b_1| - |a_1 - b_2| - |a_2 - b_1| = 2(a_1 - b_2) \le 0.$$

Similar inequalities hold for the remaining cases, i.e. for $a_2 \ge b_2 \ge b_1 \ge a_1$, $b_2 \ge a_2 \ge b_1 \ge a_1$,
and $b_2 \ge a_2 \ge a_1 \ge b_1$. Hence, Remark 3.7 (vi) implies $g \in \mathbb{BV}^2_{\mathrm{loc,rc}}$.

As for (c): Let $\mu_g$ denote the measure generated by Gini's mean difference kernel. We
have just seen that $\mu_g(\mathcal{R}_{a,b}) = 0$ for an arbitrary half-open rectangle $\mathcal{R}_{a,b}$ not intersecting
the diagonal. Moreover, we have seen that $\mu_g(\mathcal{R}_{a,b}) = 2(a_1 - b_2)$ when $b_1 \le a_1 \le b_2 \le a_2$.
Taking the other possibilities mentioned above into account, we find that

$$\mu_g(\mathcal{R}_{a,b}) = \begin{cases} 2(\max\{a_1,b_1\} - \min\{a_2,b_2\}), & \mathcal{R}_{a,b} \cap D \ne \emptyset, \\ 0, & \text{else.} \end{cases}$$

Thus, in view of Remarks 3.8-3.9, the measure $\mu_g$ generated by Gini's mean difference
kernel differs from $\mathcal{H}^1_D$ only by the sign and the factor 2, i.e. (16) holds. In view of Remark
3.7 (ii), this implies condition (c) of Lemma 3.6 because we assumed $d_\phi(F_n,F) < \infty$
P-a.s. for all $n \in \mathbb{N}$ and some weight function $\phi$ with $\int 1/\phi(x)\,dx < \infty$.

As for (d): It was shown in Example 3.1 in Beutner and Zähle (2012) that
$dg^+_{x_i}(x) = 1_{(x_i,\infty)}(x)\,dx$ and $dg^-_{x_i}(x) = 1_{(-\infty,x_i]}(x)\,dx$ for $i = 1,2$. From this and the obvious
convergence of $|x_1 - x_2|/(\phi(x_1)\phi(x_2))$ to zero as $|x_1|,|x_2| \to \infty$, along with Remark
3.7 (ii), it can be deduced easily that all limits in condition (d) of Lemma 3.6 exist and
equal zero P-a.s. ◇
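The case analysis for Gini's kernel can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `rect_increment` and `diagonal_overlap` are hypothetical, and the test rectangles are arbitrary choices covering the orderings discussed above. It compares the rectangle increment of $g(x_1,x_2) = |x_1 - x_2|$ with $-2$ times the length of the diagonal piece inside the rectangle.

```python
def rect_increment(g, a1, a2, b1, b2):
    """Signed mass that g assigns to the rectangle (a1, a2] x (b1, b2]."""
    return g(a2, b2) - g(a1, b2) - g(a2, b1) + g(a1, b1)


def gini_kernel(x1, x2):
    """Kernel of Gini's mean difference."""
    return abs(x1 - x2)


def diagonal_overlap(a1, a2, b1, b2):
    """Length of the projection of the diagonal piece inside the rectangle."""
    return max(0.0, min(a2, b2) - max(a1, b1))


# Illustrative rectangles (a1, a2, b1, b2); values are arbitrary.
rectangles = [
    (0.0, 1.0, 2.0, 3.0),  # a2 <= b1: rectangle misses the diagonal
    (1.0, 4.0, 0.0, 2.0),  # a2 >= b2 >= a1 >= b1
    (0.0, 5.0, 1.0, 2.0),  # a2 >= b2 >= b1 >= a1
    (0.0, 2.0, 1.0, 3.0),  # b2 >= a2 >= b1 >= a1
    (1.0, 2.0, 0.0, 3.0),  # b2 >= a2 >= a1 >= b1
]

for a1, a2, b1, b2 in rectangles:
    lhs = rect_increment(gini_kernel, a1, a2, b1, b2)
    rhs = -2.0 * diagonal_overlap(a1, a2, b1, b2)
    print((a1, a2, b1, b2), lhs, rhs)
```

In each of the listed cases the two printed numbers agree, which is the content of the identity $dg = -2\,\mathcal{H}^1_D$ on half-open rectangles.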
Example 3.11 (Variance) If $g(x_1,x_2) = \tfrac12(x_1 - x_2)^2$ and $F$ has a finite second moment,
then $V_g(F)$ equals the variance of $F$. We will now verify that, if $F$ and the estimator
$F_n$ satisfy Assumptions 3.1-3.2 for this $g$, and $d_\phi(F_n,F)$ is P-a.s. finite for all $n \in \mathbb{N}$ and
some weight function $\phi$ satisfying $\int |x|/\phi(x)\,dx < \infty$, then the assumptions of Lemmas
3.4 and 3.6 hold true and we have

$$dg_{1,F}(x) = dg_{2,F}(x) = (x - E[X_1])\,dx \qquad (17)$$

as well as

$$dg(x_1,x_2) = -\,dx_1\,dx_2. \qquad (18)$$

Notice that (17) shows in particular that the V-statistic corresponding to the variance
is typically non-degenerate w.r.t. $(g, F)$ in the sense of Section 2.

It was already verified in Example 3.2 in Beutner and Zähle (2012) that
$g_{i,F}(x) = \tfrac12 x^2 - xE[X_1] + \tfrac12 E[X_1^2]$ for $i = 1,2$. Therefore, our assumption $\int |x|/\phi(x)\,dx < \infty$
implies $\int 1/\phi(x)\,|dg_{i,F}|(x) < \infty$ for $i = 1,2$. Thus, $g_{i,F} \in \mathbb{BV}_{\mathrm{loc,rc}}$ and condition (c) of
Lemma 3.4 follows at once. Moreover, assumption (d) of Lemma 3.4 holds by Remark
3.5 (v). It was also established in Example 3.2 in Beutner and Zähle (2012) that (17)
holds true. We next focus on (18) and assumptions (b)-(d) of Lemma 3.6.

As for (b): It was already shown in Example 3.2 in Beutner and Zähle (2012) that $g_{x_1}$
and $g_{x_2}$ are locally of bounded variation for the above $g$. Moreover, we have $g \in \mathbb{BV}^2_{\mathrm{loc,rc}}$
by Remark 3.7 (v).

As for (c) and (d): Notice that $g(x_1,x_2) = -x_1x_2 + \tfrac12 x_1^2 + \tfrac12 x_2^2$ and recall Remark
3.8. Thus, up to the sign, the measure generated by the variance kernel is equal to
the Lebesgue measure on $\mathbb{R}^2$, i.e. (18) holds. Moreover, it was verified in Example
3.2 in Beutner and Zähle (2012) that $dg^+_{x_i}(x) = (x - x_i)\,1_{(x_i,\infty)}(x)\,dx$ and
$dg^-_{x_i}(x) = (x_i - x)\,1_{(-\infty,x_i]}(x)\,dx$ for $i = 1,2$. Thus, we see from Remark 3.7 (ii) that conditions
(c) and (d) of Lemma 3.6 hold. ◇
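For the variance kernel the split of $V_g(F_n) - V_g(F)$ into a linear and a quadratic part can be written in closed form when $F_n$ is the empirical df. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the mean and variance of $F$ are treated as known inputs, and the two parts are expressed through $g_{i,F}(x) = \tfrac12 x^2 - xE[X_1] + \tfrac12 E[X_1^2]$ and $dg = -dx_1\,dx_2$ after integration by parts.

```python
def v_statistic(kernel, sample):
    """Plug-in (V-)statistic: average of the kernel over all ordered pairs."""
    n = len(sample)
    return sum(kernel(x, y) for x in sample for y in sample) / n**2


def variance_kernel(x1, x2):
    return 0.5 * (x1 - x2) ** 2


def variance_decomposition(sample, mean, var):
    """Linear and quadratic part of V_g(F_n) - V_g(F) for the variance kernel.

    `mean` and `var` are the (assumed known) mean and variance of F.
    """
    n = len(sample)
    xbar = sum(sample) / n
    # Linear part: sum over i = 1, 2 of int g_{i,F} d(F_n - F).
    linear = sum((x - mean) ** 2 for x in sample) / n - var
    # Quadratic part: int int (F_n - F)(F_n - F) dg with dg = -dx1 dx2,
    # which equals -(int (F_n - F)(x) dx)^2 = -(xbar - mean)^2.
    quadratic = -(xbar - mean) ** 2
    return linear, quadratic


# Illustrative data; F is taken to have mean 0 and variance 1.
sample = [0.3, -1.2, 2.5, 0.7]
linear, quadratic = variance_decomposition(sample, mean=0.0, var=1.0)
difference = v_statistic(variance_kernel, sample) - 1.0
print(difference, linear + quadratic)
```

The quadratic part is of the order of the squared deviation of the sample mean, which is why it is negligible at rate $a_n$ but drives the limit at rate $a_n^2$ in the degenerate case.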



Now, let us turn to some examples where the linear part of the representation (6)
vanishes.

Example 3.12 (Gini's mean difference, degenerate case) The V-statistic $V_g(F_n)$ cor-
responding to Gini's mean difference kernel $g(x_1,x_2) = |x_1 - x_2|$ is degenerate w.r.t.
$(g, F)$ for any df $F$ that assigns probability 1/2 to two points in $\mathbb{R}$. Indeed, we know
from (15) that $dg_{1,F}(x) = dg_{2,F}(x) = (2F(x) - 1)\,dx$ for all $x \in \mathbb{R}$, so that
$\int (F_n(x) - F(x))\,dg_{i,F}(x) = 0$, $i = 1,2$, in this case. Recall that $F_n$ refers to the empirical df, and
notice that the assumptions of Lemma 3.6 trivially hold, because $(F_n - F)(x)$ equals
zero for $x$ small and large enough, respectively. ◇
Example 3.13 (Goodness-of-fit test) For a given df $F_0$ and any measurable (weight)
function $w : \mathbb{R} \to \mathbb{R}_+$, the weighted Cramér-von Mises test statistic

$$T_n := \int w(x)\,\big(F_n(x) - F_0(x)\big)^2\,dF_0(x)$$

was introduced for testing the null hypothesis $F = F_0$, and coincides with the classical
Cramér-von Mises test statistic for $w \equiv 1$. The test statistic $T_n$ can be expressed as
V-statistic $V_g(F_n)$ with kernel

$$g(x_1,x_2) := \int w(x)\,\big(1_{[x_1,\infty)}(x) - F_0(x)\big)\big(1_{[x_2,\infty)}(x) - F_0(x)\big)\,dF_0(x). \qquad (19)$$

We will now verify that, if $F_0$ is continuous and satisfies Assumptions 3.1-3.2 for this
$g$ and if the integral $\int w(x)\,dF_0(x)$ is finite, then under the null hypothesis $F = F_0$ the
assumptions of Lemmas 3.4 and 3.6 hold true and we have

$$g_{1,F_0} = g_{2,F_0} \equiv 0 \qquad (20)$$

as well as

$$dg(x_1,x_2) = \mathcal{H}^1_{w,dF_0}(d(x_1,x_2)) \qquad (21)$$

with the notation of Remark 3.9.

Equation (20) follows by using Fubini's theorem. Hence, the assumptions of Lemma
3.4 trivially hold in this case. We next focus on (21) and assumptions (b)-(d) of
Lemma 3.6. From (19) we easily see that under our assumptions the sections $g_{x_1}$ and
$g_{x_2}$ are right-continuous and locally of bounded variation. Moreover, (21) is known
from Dehling and Taqqu (1991, Example 3), and so it is apparent that $g \in \mathbb{BV}^2_{\mathrm{loc,rc}}$.
Hence, (b) holds. From (21), (13), and the fact that we assumed $\int w(x)\,dF_0(x)$ to ex-
ist, we immediately obtain (c). Further, we note that $dg^+_{x_i}(x) = w(x)F_0(x)\,dF_0(x)$ and
$dg^-_{x_i}(x) = w(x)1_{[x_i,\infty)}(x)\,dF_0(x)$ for $i = 1,2$. From this and Remark 3.7 (i), which can
be applied since $F_n$ is the empirical df, it can now be easily deduced that all limits in
condition (d) of Lemma 3.6 exist and equal zero P-a.s. ◇



Example 3.14 (Test for symmetry) The test statistic

$$T_n := \int_0^\infty \big(F_n(-t) - [1 - F_n(t-)]\big)^2\,dt \qquad (22)$$

is often used for testing symmetry of $F$ about zero; cf. Arcones and Giné (1992, Example
5.1). Using Fubini's theorem, more precisely Theorem 1.15 in Mattila (1995) with $d\mu =
dF_n \times dF_n$ and its analogue for negative integrands, $T_n$ can be expressed as V-statistic
$V_g(F_n)$ with kernel

$$g(x_1,x_2) := (|x_1| \wedge |x_2|)\big(1_{\{x_1,x_2>0\}} + 1_{\{x_1,x_2<0\}} - 1_{\{x_1>0,\,x_2<0\}} - 1_{\{x_1<0,\,x_2>0\}}\big). \qquad (23)$$

We will now verify that, if $F$ satisfies Assumptions 3.1-3.2 for this $g$, and $d_\phi(F_n,F)$ is
P-a.s. finite for all $n \in \mathbb{N}$ and some symmetric weight function $\phi$ with $\int 1/\phi(x)\,dx < \infty$,
then under the null hypothesis that $F$ be symmetric about zero the assumptions of
Lemmas 3.4 and 3.6 hold true and we have

$$g_{1,F} = g_{2,F} \equiv 0 \qquad (24)$$

as well as

$$dg(x_1,x_2) = \mathcal{H}^1_D(d(x_1,x_2)) - \mathcal{H}^1_{\overline{D}}(d(x_1,x_2)) \qquad (25)$$

with the notation of Remark 3.9.

The identity (24) is obvious (under the null hypothesis), and so the assumptions of
Lemma 3.4 trivially hold in this case. We next focus on (25) and assumptions (b)-(d)
of Lemma 3.6. From (23) we easily see that the sections $g_{x_1}$ and $g_{x_2}$ are locally of
bounded variation. Moreover, we have $dg^+(x_1,x_2) = \mathcal{H}^1_D(d(x_1,x_2))$ and $dg^-(x_1,x_2) =
\mathcal{H}^1_{\overline{D}}(d(x_1,x_2))$, and so it is apparent that $g \in \mathbb{BV}^2_{\mathrm{loc,rc}}$ and that (25) holds. Hence, (b)
holds. From (25), (13) and our assumptions we immediately obtain (c). Further, it can
be checked easily that for $i = 1,2$

\ -^x,fl]{x) dx + l[Q_x,]{x) dx , Xi<Q ' 

From this, our assumptions and the fact that $F_n$ is the empirical df, it can now be easily
deduced that the first two limits in condition (d) of Lemma 3.6 exist and equal zero
P-a.s. Using our assumption $\int 1/\phi(x)\,dx < \infty$ and Remark 3.7 (ii), it follows that the
third limit in condition (d) of Lemma 3.6 exists and equals zero P-a.s. ◇
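The identification of the symmetry statistic with a V-statistic can be checked on a small sample. The sketch below rests on two assumptions that are not fully legible in the source: the integral in (22) is taken over $t \in (0,\infty)$ and the integrand is squared, which is the reading under which (22) agrees with the kernel (23). The function names are hypothetical.

```python
def symmetry_kernel(x1, x2):
    """Kernel (23): min(|x1|, |x2|), with sign +1 if x1 and x2 lie on the
    same side of zero and -1 if they lie on opposite sides."""
    if x1 == 0 or x2 == 0:
        return 0.0
    sign = 1.0 if (x1 > 0) == (x2 > 0) else -1.0
    return sign * min(abs(x1), abs(x2))


def v_statistic(kernel, sample):
    """Plug-in (V-)statistic: average of the kernel over all ordered pairs."""
    n = len(sample)
    return sum(kernel(x, y) for x in sample for y in sample) / n**2


def symmetry_statistic(sample):
    """Integral over t in (0, inf) of (F_n(-t) - [1 - F_n(t-)])^2 for the
    empirical df F_n. The integrand is constant between consecutive values
    of |X_i| and vanishes beyond the largest one."""
    n = len(sample)
    knots = sorted({0.0} | {abs(x) for x in sample})
    total = 0.0
    for left, right in zip(knots, knots[1:]):
        t = 0.5 * (left + right)  # any point of the open interval works
        f_at_minus_t = sum(1 for x in sample if x <= -t) / n  # F_n(-t)
        f_left_of_t = sum(1 for x in sample if x < t) / n     # F_n(t-)
        total += (f_at_minus_t - (1.0 - f_left_of_t)) ** 2 * (right - left)
    return total


sample = [1.0, -2.0, 0.5, 3.0, -0.25]  # illustrative data
print(symmetry_statistic(sample), v_statistic(symmetry_kernel, sample))
```

Both numbers coincide up to floating-point error, so the plug-in form $V_g(F_n)$ can be used in place of the integral form.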



3.3 Weak (central) limit theorems 

In this section we give a tool for deriving the asymptotic distribution of V-statistics, 
which is suitable for independent and weakly dependent data. Moreover, in some par- 
ticular cases it also yields a nontrivial asymptotic distribution for strongly dependent 
data. Recall that $(\mathbf{V}, d)$ is some metric subspace of $\mathbf{D}$, and that $\mathcal{V} = \mathcal{D} \cap \mathbf{V}$.

Theorem 3.15 Let $(a_n)$ be a sequence of positive real numbers with $a_n \to \infty$. Assume
that

(a) the assumptions of Lemmas 3.4 and 3.6 are fulfilled,

(b) on $\mathbf{V}$ the functionals $\Phi_{i,g}$, $i = 1,2,3$, appearing in the representation (6) are
well-defined, $(\mathcal{V}, \mathcal{B}(\mathbb{R}))$-measurable and $d$-continuous,

(c) the process $a_n(F_n - F)$ is a random element of $(\mathbf{V}, \mathcal{V})$ for all $n \in \mathbb{N}$, and there is
some random element $B^\circ$ of $(\mathbf{V}, \mathcal{V})$ such that $P[B^\circ \in S] = 1$ for some $d$-separable
$S \in \mathcal{V}$ consisting of $(\mathcal{V},d)$-completely regular points (in the sense of Definition
IV.2.6 in Pollard (1984)) only, and

$$a_n(F_n - F) \xrightarrow{d} B^\circ \quad (\text{in } (\mathbf{V}, \mathcal{V}, d)). \qquad (26)$$

Then the following assertions hold:

(i) We always have

$$a_n\big(V_g(F_n) - V_g(F)\big) \xrightarrow{d} -\sum_{i=1}^{2} \int B^\circ(x-)\,dg_{i,F}(x) \quad (\text{in } (\mathbb{R}, \mathcal{B}(\mathbb{R}))). \qquad (27)$$

(ii) If the V-statistic $V_g(F_n)$ is degenerate w.r.t. $(g,F)$, then we additionally have

$$a_n^2\big(V_g(F_n) - V_g(F)\big) \xrightarrow{d} \int\!\!\int B^\circ(x_1-)B^\circ(x_2-)\,dg(x_1,x_2) \quad (\text{in } (\mathbb{R}, \mathcal{B}(\mathbb{R}))). \qquad (28)$$

Proof (i) Under condition (a) we can apply Lemmas 3.4 and 3.6 to obtain the represen-
tation (6). From the Continuous Mapping Theorem, which is applicable by conditions
(b) and (c), we obtain that

$$\Phi_{i,g}\big(a_n(F_n - F)\big) \xrightarrow{d} -\int B^\circ(x-)\,dg_{i,F}(x), \qquad i = 1,2.$$

From Slutsky's lemma we obtain that $\sqrt{a_n}\,(F_n - F)$ converges to zero in probability,
and so, according to the Continuous Mapping Theorem, $\Phi_{3,g}(\sqrt{a_n}\,(F_n - F))$ converges
in probability to zero as well. Since $a_n\,\Phi_{3,g}(F_n - F) = \Phi_{3,g}(\sqrt{a_n}\,(F_n - F))$, applying
once again Slutsky's lemma finishes the proof of part (i).

(ii) If the V-statistic is degenerate, then we obtain analogously by applying Lemma
3.4 and Lemma 3.6 that $a_n^2(V_g(F_n) - V_g(F)) = \Phi_{3,g}(a_n(F_n - F))$. The result is now
immediate from the Continuous Mapping Theorem. □

Remark 3.16 If, for some weight function $\phi$, the integral $\int 1/\phi(x)\,|dg_{i,F}|(x)$ is finite
for $i = 1,2$, then the mappings $\Phi_{1,g}$ and $\Phi_{2,g}$ are obviously $d_\phi$-continuous. Moreover, if
the integral $\int\!\!\int 1/(\phi(x_1)\phi(x_2))\,|dg|(x_1,x_2)$ is finite, then $\Phi_{3,g}$ is $d_\phi$-continuous, too. ◇

Example 3.17 (i.i.d. data) Suppose $X_1, X_2, \dots$ are i.i.d. with df $F$, and let $\phi$ be a
weight function. If $\int \phi^2\,dF < \infty$, then Theorem 6.2.1 in Shorack and Wellner (1986)
shows that for the empirical df $F_n$ of $X_1, \dots, X_n$,

$$\sqrt{n}\,(F_n - F) \xrightarrow{d} B^\circ_F \quad (\text{in } (\mathbf{D}_\phi, \mathcal{D}_\phi, d_\phi)), \qquad (29)$$

where $B^\circ_F$ is an $F$-Brownian bridge, i.e. a centered Gaussian process with covariance
function $\Gamma(x,y) = F(x \wedge y)\,(1 - F(x \vee y))$. ◇
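The covariance function of the $F$-Brownian bridge is the covariance of the indicators $1_{\{X_1 \le x\}}$ and $1_{\{X_1 \le y\}}$. The following sketch checks this identity exactly for a small discrete distribution; the distribution and the helper names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def df(points, probs, x):
    """Distribution function of a discrete law with atoms `points`."""
    return sum((p for pt, p in zip(points, probs) if pt <= x), Fraction(0))


def indicator_cov(points, probs, x, y):
    """Cov(1{X <= x}, 1{X <= y}) computed from the definition."""
    joint = sum((p for pt, p in zip(points, probs) if pt <= x and pt <= y),
                Fraction(0))
    return joint - df(points, probs, x) * df(points, probs, y)


def bridge_cov(points, probs, x, y):
    """F(x ^ y) * (1 - F(x v y)), the F-Brownian bridge covariance."""
    return df(points, probs, min(x, y)) * (1 - df(points, probs, max(x, y)))


# Illustrative three-point distribution.
points = [0, 1, 2]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

grid = [-1, 0, Fraction(1, 2), 1, Fraction(3, 2), 2, 3]
for x in grid:
    for y in grid:
        assert indicator_cov(points, probs, x, y) == bridge_cov(points, probs, x, y)
print(bridge_cov(points, probs, 0, 1))
```

Exact rational arithmetic is used so that the comparison does not depend on rounding.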



Example 3.18 (weakly dependent data) Let $(X_i)$ be $\alpha$-mixing with mixing coefficients
satisfying $\alpha(n) = O(n^{-\theta})$ for some $\theta > 1 + \sqrt{2}$, and let $\lambda \ge 0$. If $F$ has a finite $\gamma$-moment
for some 7 > g^, then it can easily be deduced from Theorem 2.2 in lShao and Yul ( 19961 ) 
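that

$$\sqrt{n}\,(F_n - F) \xrightarrow{d} \widetilde{B}^\circ_F \quad (\text{in } (\mathbf{D}_{\phi_\lambda}, \mathcal{D}_{\phi_\lambda}, d_{\phi_\lambda})) \qquad (30)$$

with $\widetilde{B}^\circ_F$ a continuous centered Gaussian process with covariance function

$$\Gamma(s,t) = F(s \wedge t)\,(1 - F(s \vee t)) + \sum_{k=2}^{\infty} \Big[\mathrm{Cov}\big(1_{\{X_1 \le s\}}, 1_{\{X_k \le t\}}\big) + \mathrm{Cov}\big(1_{\{X_1 \le t\}}, 1_{\{X_k \le s\}}\big)\Big]; \qquad (31)$$

cf. Section 3.3 in Beutner and Zähle (2010). If $(X_i)$ is even $\phi$- or $\rho$-mixing, then Lemma
4.1 in Chen and Fan (2006) and Theorem 2.3 in Shao and Yu (1996) ensure that the
mixing condition can be relaxed; see also Section 3.2 in Beutner and Zähle (2012). ◇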



Part (ii) of Theorem 3.15 also leads to some interesting corollaries that provide con-
venient alternatives to the common approach to derive the asymptotic distribution of
degenerate U-statistics. Before presenting them, it is worth recalling that the common
approach to derive the asymptotic distribution of degenerate U-statistics is based on a
series expansion of the kernel $g$ of the form

$$g(x_1,x_2) = \sum_{k=1}^{\infty} \lambda_k\,\psi_k(x_1)\,\psi_k(x_2),$$

where the $\lambda_k$ are real numbers and the $\psi_k$ are an orthonormal sequence; see, for example,
Serfling (1980, Section 5.5). The $\lambda_k$ and the $\psi_k$ are the eigenvalues and eigenfunctions,
respectively, of the operator $A : L^2(\mathbb{R}, \mathcal{B}(\mathbb{R}), dF) \to L^2(\mathbb{R}, \mathcal{B}(\mathbb{R}), dF)$ defined by

$$A(h)(x_1) := \int g(x_1,x_2)\,h(x_2)\,dF(x_2).$$

The eigenvalues arise in the asymptotic distribution of $n(U_{g,n} - V_g(F))$ which, in the
i.i.d. case, is given by $\sum_{k=1}^{\infty} \lambda_k(\xi_k - 1)$, where the $\xi_k$ are independent and have a $\chi^2$-
distribution with 1 degree of freedom.

Corollary 3.19 Assume that the conditions of Theorem 3.15 hold and that we are in
case (ii) of this theorem. Moreover, let the (possibly signed) measure generated by $g$ on
$\mathbb{R}^2$ be equal to the product measure of (possibly signed) measures $\nu_1$ and $\nu_2$. Then

$$a_n^2\big(V_g(F_n) - V_g(F)\big) \xrightarrow{d} \Big(\int B^\circ(x-)\,d\nu_1(x)\Big)\Big(\int B^\circ(x-)\,d\nu_2(x)\Big) \quad (\text{in } (\mathbb{R}, \mathcal{B}(\mathbb{R}))).$$

The next example shows that two well-known kernels are special cases of Corollary
3.19.



Example 3.20 (i) The variance kernel $g(x_1,x_2) = \tfrac12(x_1 - x_2)^2 = -x_1x_2 + \tfrac12 x_1^2 + \tfrac12 x_2^2$
is degenerate if and only if the fourth central moment equals the squared second central
moment; see, for example, van der Vaart (1998, Example 12.12). Moreover, in Example
3.11 we have seen that the measure generated by the variance kernel coincides with the
negative Lebesgue measure on $\mathbb{R}^2$. So, in the degenerate case, the variance kernel can
be treated by means of Corollary 3.19.

(ii) The kernel $g(x_1,x_2) = x_1x_2$, which corresponds to the characteristic $E[X_1]^2$ and
which is degenerate if the first moment equals zero, obviously generates the Lebesgue
measure. In particular, up to the sign, it generates the same measure on $\mathbb{R}^2$ as the
variance kernel. So, in the degenerate case, this kernel can be treated by means of
Corollary 3.19 as well. Of course, for the corresponding V-statistics the asymptotic
distributions can be derived differently, but the continuous mapping approach reveals
an interesting relation to the variance kernel. ◇

Recall from (12) the definition of the measure $\mathcal{H}^1_{w,\mu}$.



Corollary 3.21 Assume that the conditions of Theorem 3.15 hold and that we are in
case (ii) of this theorem. Moreover, let the measure generated by $g$ be given by $\mathcal{H}^1_{w,\mu}$.
Then

$$a_n^2\big(V_g(F_n) - V_g(F)\big) \xrightarrow{d} \int w(x)\,\big(B^\circ(x-)\big)^2\,\mu(dx) \quad (\text{in } (\mathbb{R}, \mathcal{B}(\mathbb{R}))).$$

Here are two examples to which Corollary 3.21 can be applied.

Example 3.22 (i) In Example 3.12 we have seen that Gini's mean difference is degen-
erate for a df that assigns probability 1/2 to two points in $\mathbb{R}$. Further, from Example
3.10 we also know that the measure generated by $g$ differs from $\mathcal{H}^1_{1,\ell} = \mathcal{H}^1_D$ only by a
constant factor. So, Corollary 3.21 can be applied.

(ii) In Example 3.13 we have seen that the measure generated by the kernel $g$ of the
Cramér-von Mises statistic (cf. (19)) equals the measure $\mathcal{H}^1_{w,dF_0}$. Thus, the Cramér-von
Mises statistic can also be treated by means of Corollary 3.21. For the particular case
$w \equiv 1$ see also van der Vaart (1998, Corollary 19.21). ◇

Although the asymptotic distributions in Example 3.22 can be derived differently, the
two examples given are appealing from a structural point of view.

3.4 Strong limit theorems 

In this section, we focus on almost sure convergence of the plug-in estimator $V_g(F_n)$
to $V_g(F)$. Assume that the representation (6) holds, that the mapping $\Phi : \mathbf{V} \to \mathbb{R}$,
$f \mapsto \sum_{i=1}^{2} \Phi_{i,g}(f) + \Phi_{3,g}(f)$, is $d$-continuous at the null function, and that $F_n - F$
converges P-a.s. to the null function w.r.t. $d$. Then we immediately obtain almost sure
convergence of $V_g(F_n)$ to $V_g(F)$. From the following obvious theorem we can even deduce
the rate of convergence. By local $\beta$-Hölder $d$-continuity of a functional $\Phi : \mathbf{V} \to \mathbb{R}$ at $f$ we
mean that $|\Phi(f_n) - \Phi(f)| = O(d(f_n, f)^\beta)$ for each sequence $(f_n) \subset \mathbf{V}$ with $d(f_n, f) \to 0$.

Theorem 3.23 Let $\phi : \mathbb{R} \to [1,\infty)$ be some weight function, let $d$ be homogeneous, and
assume that

(a) the assumptions of Lemmas 3.4 and 3.6 are fulfilled,

(b) on $\mathbf{V}$ the functionals $\Phi_{i,g}$, $i = 1,2,3$, appearing in the representation (6) are
well-defined and $(\mathcal{V}, \mathcal{B}(\mathbb{R}))$-measurable, and the functional $\sum_{i=1}^{3} \Phi_{i,g}$ is locally
$\beta$-Hölder $d$-continuous at the null function for some $\beta > 0$,

(c) the process $F_n - F$ is a random element of $(\mathbf{V}, \mathcal{V})$ for all $n \in \mathbb{N}$, and, for some
sequence $(a_n)$ in $(0, \infty)$,

$$a_n\,d(F_n - F, 0) \to 0 \quad \text{P-a.s.} \qquad (32)$$

Then

$$a_n^\beta\,\big(V_g(F_n) - V_g(F)\big) \to 0 \quad \text{P-a.s.}$$

Remark 3.24 If, for some weight function $\phi$, the integral $\int 1/\phi(x)\,|dg_{i,F}|(x)$ is finite for
$i = 1,2$, then the functionals $\Phi_{1,g}$ and $\Phi_{2,g}$ are obviously locally 1-Hölder $d_\phi$-continuous
at the null function. Moreover, if the integral $\int\!\!\int 1/(\phi(x_1)\phi(x_2))\,|dg|(x_1,x_2)$ is finite, then
the functional $\Phi_{3,g}$ is obviously 2-Hölder $d_\phi$-continuous at the null function. Thus, in
this case the functional $\sum_{i=1}^{3} \Phi_{i,g}$ is locally 1-Hölder $d_\phi$-continuous at the null function,
and the rate of convergence of degenerate V-statistics w.r.t. $(g,F)$, i.e. of V-statistics
with $\sum_{i=1}^{2} \Phi_{i,g}(F_n - F) = 0$, is twice the rate of non-degenerate V-statistics w.r.t. $(g,F)$. ◇

The following examples illustrate condition (32). For any nonincreasing function $h :
\mathbb{R}_+ \to \mathbb{R}_+$, we let $h^{\to}(y) := \sup\{x \in \mathbb{R}_+ : h(x) > y\}$, $y \in \mathbb{R}_+$, be its right-continuous
inverse, with the convention $\sup \emptyset := 0$.

Example 3.25 Let $\phi$ be any weight function, and $r \in [0, \tfrac12)$. If the sequence $(X_i)$ is
i.i.d. and $\int \phi^{1/(1-r)}\,dF < \infty$, then (32) holds for the weighted sup-metric $d := d_\phi$, the
empirical df $F_n := \hat{F}_n$, and $a_n := n^r$; cf. Andersen et al. (1988, Theorem 7.3). ◇



Example 3.26 Suppose that $\int \phi\,dF < \infty$. Further suppose that $(X_i)$ is $\alpha$-mixing with
mixing coefficients $\alpha(n)$, let $\alpha(t) := \alpha(\lfloor t \rfloor)$ be the càdlàg extension of $\alpha(\cdot)$ from $\mathbb{N}$ to
$\mathbb{R}_+$, and assume that

$$\int \log\big(1 + \alpha^{\to}(s/2)\big)\,\overline{G}^{\to}(s)\,ds < \infty \qquad (33)$$

for $\overline{G} := 1 - G$, where $G$ denotes the df of $\phi(X_1)$. It was shown in Zähle (2012) that, under
the imposed assumptions, (32) holds for the weighted sup-metric $d := d_\phi$, the empirical df
$F_n := \hat{F}_n$, and $a_n := 1$. Notice that (33) holds in particular if $E[\phi(X_1) \log^+ \phi(X_1)] < \infty$
and $\alpha(n) = O(n^{-\vartheta})$ for some arbitrarily small $\vartheta > 0$; cf. Rio (1994, Application 5,
p. 924). ◇



Example 3.27 Suppose that the sequence $(X_i)$ is $\alpha$-mixing with mixing coefficients
$\alpha(n)$. Let $r \in [0, \tfrac12)$ and assume that $\alpha(n) \le K n^{-\vartheta}$ for all $n \in \mathbb{N}$ and some constants
$K > 0$ and $\vartheta > 2r$. Then (32) holds for the uniform sup-metric $d := d_\infty$, the empirical
df $F_n := \hat{F}_n$, and $a_n := n^r$; cf. Zähle (2012). ◇



3.5 Extensions and limitations 

The important extension of U-statistics based on one sample to $k$-sample problems has
been made by Lehmann (1951). Here, we briefly show that the continuous mapping
approach can also be applied to these statistics. As an example, we discuss the two-sample
case. Let $X_1, \dots, X_{n_1}$ and $Y_1, \dots, Y_{n_2}$ be two independent samples from the df $F$ and
$G$, respectively. A two-sample U-statistic based on the kernel $h$ of degree (1, 1) is given
by

$$U_{h,n_1,n_2} := \frac{1}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} h(X_i, Y_j), \qquad (34)$$

and it is an estimator for

$$U_{h,(F,G)} := \int\!\!\int h(x,y)\,dF(x)\,dG(y),$$

provided that the latter quantity exists; see, for instance, Serfling (1980, Section 5.1.3).
A well-known kernel of degree (1, 1) is, for example, $h(x,y) = x - y$. It corresponds to
a two-sample comparison of the means and it is easily seen that it fulfills the conditions
below for the continuous mapping approach to two-sample U-statistics.

Of course we can rewrite (34) as $U_{h,n_1,n_2} = \int\!\!\int h(x,y)\,dF_{n_1}(x)\,dG_{n_2}(y)$. Thus, under
the appropriate set of conditions (imposing the conditions of Lemma 3.4 on $h_G(x) :=
\int h(x,y)\,dG(y)$, $h_F(y) := \int h(x,y)\,dF(x)$ and on $F_{n_1} - F$ and $G_{n_2} - G$, and those of
Lemma 3.6 on $h$ and $F_{n_1} - F$ and $G_{n_2} - G$), we obtain

$$\begin{aligned}
U_{h,n_1,n_2} - U_{h,(F,G)} ={}& -\int \big(F_{n_1}(x-) - F(x-)\big)\,dh_G(x) - \int \big(G_{n_2}(y-) - G(y-)\big)\,dh_F(y) \\
& + \int\!\!\int \big(F_{n_1}(x-) - F(x-)\big)\big(G_{n_2}(y-) - G(y-)\big)\,dh(x,y). \qquad (35)
\end{aligned}$$

So $U_{h,n_1,n_2} - U_{h,(F,G)}$ has a representation similar to (6). Now, suppose that $\sqrt{n_1}\,(F_{n_1} - F)$
converges in distribution to some $B^\circ_1$ in $(\mathbf{V}_1, \mathcal{V}_1, d_1)$, and that $\sqrt{n_2}\,(G_{n_2} - G)$ converges in
distribution to some $B^\circ_2$ in $(\mathbf{V}_2, \mathcal{V}_2, d_2)$. Assume further that the mappings $T_{1,h} : \mathbf{V}_1 \to
\mathbb{R}$, $T_{2,h} : \mathbf{V}_2 \to \mathbb{R}$ and $T_{3,h} : \mathbf{V}_1 \times \mathbf{V}_2 \to \mathbb{R}$, defined by $T_{1,h}(f) := \int f(x-)\,dh_G(x)$,
$T_{2,h}(g) := \int g(y-)\,dh_F(y)$ and $T_{3,h}(f,g) := \int\!\!\int f(x-)g(y-)\,dh(x,y)$, respectively, are
$d_1$-, $d_2$- and $d_1 \times d_2$-continuous, respectively. Then, in view of (35), we obtain

$$\sqrt{n_1 + n_2}\,\big(U_{h,n_1,n_2} - U_{h,(F,G)}\big) \xrightarrow{d} -\frac{1}{\sqrt{\rho}} \int B^\circ_1(x-)\,dh_G(x) - \frac{1}{\sqrt{1-\rho}} \int B^\circ_2(y-)\,dh_F(y),$$

provided that $n_1/(n_1 + n_2) \to \rho$ for some $\rho \in (0, 1)$. This is immediate by noting that

$$\begin{aligned}
\sqrt{n_1 + n_2}\;T_{1,h}(F_{n_1} - F) &= \sqrt{(n_1 + n_2)/n_1}\;T_{1,h}\big(\sqrt{n_1}\,(F_{n_1} - F)\big), \\
\sqrt{n_1 + n_2}\;T_{2,h}(G_{n_2} - G) &= \sqrt{(n_1 + n_2)/n_2}\;T_{2,h}\big(\sqrt{n_2}\,(G_{n_2} - G)\big), \\
\sqrt{n_1 + n_2}\;T_{3,h}(F_{n_1} - F, G_{n_2} - G) &= \sqrt{(n_1 + n_2)/(n_1 n_2)}\;T_{3,h}\big(\sqrt{n_1}\,(F_{n_1} - F), \sqrt{n_2}\,(G_{n_2} - G)\big).
\end{aligned}$$

Indeed, due to their independence the processes $\sqrt{n_1}\,(F_{n_1} - F)$ and $\sqrt{n_2}\,(G_{n_2} - G)$ jointly
converge in distribution to $B^\circ = (B^\circ_1, B^\circ_2)$ and then, since $\sqrt{(n_1 + n_2)/(n_1 n_2)} \to 0$ as
$n_1 + n_2 \to \infty$, and $n_1/(n_1 + n_2) \to \rho$, the result follows from Slutsky's lemma. Hence,
one of the advantages of the continuous mapping approach continues to hold for two-
sample U-statistics.
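A two-sample U-statistic of degree (1, 1) is simply an average of the kernel over all pairs $(X_i, Y_j)$. The sketch below is an illustration with hypothetical function names and made-up data; it evaluates the statistic for the mean-comparison kernel $h(x,y) = x - y$, for which the statistic reduces to the difference of the two sample means, and for an indicator kernel.

```python
from fractions import Fraction


def two_sample_u(h, xs, ys):
    """Two-sample U-statistic of degree (1, 1): average of h over all pairs."""
    return sum(h(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def mean_difference_kernel(x, y):
    return x - y


def indicator_kernel(x, y):
    """1 if x <= y and 0 otherwise."""
    return 1 if x <= y else 0


# Illustrative samples; exact rationals keep the comparison free of rounding.
xs = [Fraction(1), Fraction(3), Fraction(8)]
ys = [Fraction(2), Fraction(5)]

u_mean = two_sample_u(mean_difference_kernel, xs, ys)
sample_mean_difference = sum(xs) / len(xs) - sum(ys) / len(ys)
print(u_mean, sample_mean_difference)

u_indicator = two_sample_u(indicator_kernel, xs, ys)
print(u_indicator)
```

For the kernel $h(x,y) = x - y$ the measure generated by $h$ on $\mathbb{R}^2$ is zero, so only the two linear terms of the representation contribute.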

The following remark reveals a limitation of the continuous mapping approach. It is
worth mentioning that the only crucial point for an application of Theorem
3.15 or its analogue for two-sample U-statistics seems to be that the kernel does not have
too many discontinuities.

Remark 3.28 (i) A well known two-sample U-statistic is the Wilcoxon and Mann-
Whitney two-sample test statistic

$$U_{h,n_1,n_2} = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} 1_{\{X_i \le Y_j\}}$$

introduced by Wilcoxon (1945) and Mann and Whitney (1947). Notice that $\widetilde{U}_{h,n_1,n_2} :=
U_{h,n_1,n_2}/(n_1 n_2)$ is an estimator for $\int\!\!\int 1_{\{x \le y\}}\,dF(x)\,dG(y) = P[X_1 \le Y_1]$, where $(X_i)$
and $(Y_i)$ are sequences of identically distributed random variables with continuous dfs
$F$ and $G$, respectively. The statistic $U_{h,n_1,n_2}$ can be used for testing the hypothesis
$F = G$. Of course, $\widetilde{U}_{h,n_1,n_2}$ is a two-sample U-statistic with kernel $h(x,y) := 1_{\{x \le y\}}$
of degree (1,1). It is easy to check that the sections $h_x(\cdot) = h(x,\cdot)$ and $h_y(\cdot) = h(\cdot,y)$
are of bounded variation and thus fulfill the second part of assumption (b) of Lemma
3.6. Moreover, the functions $h_F(\cdot) = \int h(x,\cdot)\,dF(x)$ and $h_G(\cdot) = \int h(\cdot,y)\,dG(y)$ satisfy
condition (b) of Lemma 3.4. However, $h$ as a function from $\mathbb{R}^2$ to $\mathbb{R}$ is not locally
of bounded variation. Indeed, take for example the square $(0, 1] \times (0, 1]$ and the grid
$0 = x_0 < 1/n < \dots < (n-1)/n < x_n = 1$ and $0 = y_0 < 1/n < \dots < (n-1)/n < y_n = 1$
for $n \ge 2$. Then

$$n = \sum_{i=1}^{n} \big|h(x_i,y_i) - h(x_{i-1},y_i) - h(x_i,y_{i-1}) + h(x_{i-1},y_{i-1})\big|
\le \sum_{i=1}^{n} \sum_{j=1}^{n} \big|h(x_i,y_j) - h(x_{i-1},y_j) - h(x_i,y_{j-1}) + h(x_{i-1},y_{j-1})\big|$$

for every $n \ge 2$.

(ii) A similar reasoning applies to the (one-sample) kernel $g(x_1,x_2) := 1_{\{x_1 + x_2 > 0\}}$.
The corresponding U-statistic



1=1 j=i 

is known to be asymptotically equivalent to the Wilcoxon signed rank test for testing
whether the distribution is centered at zero; see, for instance, van der Vaart (1998,
Example 12.4). However, $g$ as a function from $\mathbb{R}^2$ to $\mathbb{R}$ is not locally of bounded variation.
Indeed, take for example the square $(-1,0] \times (0, 1]$ and the grid $-1 = x_n < -1 + 1/n
< \dots < -1 + (n-1)/n < x_0 = 0$ and $0 = y_0 < 1/n < \dots < 1 - 1/n < y_n = 1$ for
$n \ge 2$. Then again

$$n = \sum_{i=1}^{n} \big|g(x_i,y_i) - g(x_{i-1},y_i) - g(x_i,y_{i-1}) + g(x_{i-1},y_{i-1})\big|
\le \sum_{i=1}^{n} \sum_{j=1}^{n} \big|g(x_i,y_j) - g(x_{i-1},y_j) - g(x_i,y_{j-1}) + g(x_{i-1},y_{j-1})\big|$$

for every $n \ge 2$. ◇
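The failure of local bounded variation for the indicator kernel can be made concrete by summing the absolute rectangle increments over the grid cells of $(0,1] \times (0,1]$. This is a small sketch with hypothetical function names, assuming the kernel $h(x,y) = 1_{\{x \le y\}}$ and the equidistant grid $x_i = y_i = i/n$; the sum over the diagonal cells alone already equals $n$.

```python
from fractions import Fraction


def indicator_kernel(x, y):
    """h(x, y) = 1 if x <= y and 0 otherwise."""
    return 1 if x <= y else 0


def cell_increment(h, x_lo, x_hi, y_lo, y_hi):
    """Increment of h over the grid cell (x_lo, x_hi] x (y_lo, y_hi]."""
    return h(x_hi, y_hi) - h(x_lo, y_hi) - h(x_hi, y_lo) + h(x_lo, y_lo)


def diagonal_variation(h, n):
    """Sum of absolute increments over the n diagonal cells of the grid."""
    grid = [Fraction(i, n) for i in range(n + 1)]
    return sum(abs(cell_increment(h, grid[i - 1], grid[i], grid[i - 1], grid[i]))
               for i in range(1, n + 1))


def full_variation(h, n):
    """Sum of absolute increments over all n * n cells of the grid."""
    grid = [Fraction(i, n) for i in range(n + 1)]
    return sum(abs(cell_increment(h, grid[i - 1], grid[i], grid[j - 1], grid[j]))
               for i in range(1, n + 1) for j in range(1, n + 1))


for n in (2, 10, 50):
    print(n, diagonal_variation(indicator_kernel, n), full_variation(indicator_kernel, n))
```

Since the sums grow without bound as the grid is refined, no finite total variation on the square can exist.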
4 The use of the representation (6) for linear long-memory
sequences

As indicated in the introduction and in Section 2, for sequences exhibiting long-range
dependence it may happen that the linear part of the von Mises decomposition degener-
ates only asymptotically. In such a case, Theorem 3.15 may not yield a non-central limit
theorem; for more details see the discussion below just after the proof of Theorem 4.1.
Nevertheless, representation (6), the Continuous Mapping Theorem, and an "expansion"
of the empirical process will lead to a general result to derive non-central limit theorems
for U- and V-statistics based on linear long-memory sequences. Thus, in this section, we
shall consider a linear process exhibiting long-range dependence (strong dependence),
i.e.

$$X_t := \sum_{s=0}^{\infty} a_s\,\varepsilon_{t-s}, \qquad t \in \mathbb{N}, \qquad (36)$$

where $(\varepsilon_i)_{i \in \mathbb{Z}}$ are i.i.d. random variables on some probability space $(\Omega, \mathcal{F}, P)$ with zero
mean and finite variance, and the coefficients $a_s$ satisfy $\sum_{s=0}^{\infty} a_s^2 < \infty$ (so that $(X_t)_{t \in \mathbb{N}}$ is
an $L^2$-process) and decay sufficiently slowly so that $\sum_{t=1}^{\infty} |\mathrm{Cov}(X_1, X_t)| = \infty$. The latter
divergence gives the precise meaning to the attribute long-range dependence. Notice that
if $\varepsilon_1$ has a finite $p$th moment for some $p \ge 2$, then the same holds for $X_1$. As before, we
denote by $F$ the df of the $X_t$.

For $n \in \mathbb{N}$ and $p \in \mathbb{N}_0$, assume that the $p$th moment of $F$ is finite and that $F$ can be
differentiated at least $p$ times. Denote the $j$th derivative of $F$ by $F^{(j)}$, $j = 0, \dots, p$, with
the convention $F^{(0)} = F$, and define a stochastic process $\mathcal{E}_{n,p;F}$ with index set $\mathbb{R}$ by



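$$\begin{aligned}
\mathcal{E}_{n,p;F}(\cdot) :={}& F_n(\cdot) - \sum_{j=0}^{p} (-1)^j F^{(j)}(\cdot)\Big[\frac{1}{n} \sum_{i=1}^{n} A_{j;F}(X_i)\Big] \\
={}& F_n(\cdot) - F(\cdot) - \sum_{j=1}^{p} (-1)^j F^{(j)}(\cdot)\Big(\frac{1}{n} \sum_{i=1}^{n} A_{j;F}(X_i)\Big), \qquad (37)
\end{aligned}$$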





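where $A_{j;F}$ denotes the $j$th order Appell polynomial associated with $F$, and we use
the convention $\sum_{j=1}^{0}(\cdots) := 0$. Recall that these Appell polynomials are defined by
$A_{0;F}(x) := 1$ and for $j = 1, \dots, p$ recursively by the characteristic conditions

$$\frac{d}{dx} A_{j;F}(x) = j\,A_{j-1;F}(x) \quad \text{and} \quad \int A_{j;F}(y)\,dF(y) = 0.$$

Notice that in particular

$$\mathcal{E}_{n,0;F}(\cdot) = F_n(\cdot) - F(\cdot), \qquad
\mathcal{E}_{n,1;F}(\cdot) = \big(F_n(\cdot) - F(\cdot)\big) + F^{(1)}(\cdot)\Big[\frac{1}{n} \sum_{i=1}^{n} X_i\Big].$$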
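The two characteristic conditions determine the Appell polynomials recursively from the moments of $F$: integrate $j\,A_{j-1;F}$ and then fix the constant term so that the polynomial has mean zero under $F$. The sketch below is illustrative only; the function name and the coefficient-list representation are assumptions, and the moments of the standard normal law are used as test input, for which the recursion yields the Hermite polynomials.

```python
from fractions import Fraction


def appell_polynomials(moments, p):
    """Coefficient lists (lowest degree first) of A_0, ..., A_p.

    `moments[k]` must hold the k-th moment E[X^k] of F for k = 0, ..., p.
    The recursion uses A_j' = j * A_{j-1} and E[A_j(X)] = 0.
    """
    polys = [[Fraction(1)]]
    for j in range(1, p + 1):
        prev = polys[-1]
        # Antiderivative of j * A_{j-1} with constant term still open.
        cur = [Fraction(0)] + [Fraction(j) * c / (k + 1) for k, c in enumerate(prev)]
        # Choose the constant term so that the mean under F vanishes.
        mean = sum(c * Fraction(moments[k]) for k, c in enumerate(cur))
        cur[0] = -mean
        polys.append(cur)
    return polys


# Moments E[X^k], k = 0..4, of the standard normal distribution.
normal_moments = [1, 0, 1, 0, 3]
polys = appell_polynomials(normal_moments, 3)
for j, coeffs in enumerate(polys):
    print(j, [str(c) for c in coeffs])
```

For a centered distribution the first polynomial is $A_{1;F}(x) = x$, in line with the sample mean appearing in $\mathcal{E}_{n,1;F}$.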
For $p \in \mathbb{N}$, we obviously have

$$\mathcal{E}_{n,p-1;F}(\cdot) = \mathcal{E}_{n,p;F}(\cdot) + (-1)^p F^{(p)}(\cdot)\Big[\frac{1}{n} \sum_{i=1}^{n} A_{p;F}(X_i)\Big], \qquad (38)$$

and we note that under a suitable re-scaling the limit in distribution, $Z_{p,\beta}$, of the nor-
malized sum $\frac{1}{n}\sum_{i=1}^{n} A_{p;F}(X_i)$ has been established by Avram and Taqqu (1987) for
$1 \le p < 1/(2\beta - 1)$ (for the meaning of $\beta$ see Theorem 4.1 below). So, whenever the pro-
cess $\mathcal{E}_{n,p;F}(\cdot)$ can be shown to converge in probability to zero under the same re-scaling,
we obtain that the limit in distribution of a re-scaled version of the process $\mathcal{E}_{n,p-1;F}(\cdot)$ is
given by $(-1)^p F^{(p)}(\cdot)\,Z_{p,\beta}$. This idea is basically due to Dehling and Taqqu (1989) who
considered the Gaussian case and the uniform sup-metric $d_\infty$. For the linear process
and the uniform sup-metric $d_\infty$ this approach was used by Ho and Hsing (1996) and
Wu (2003) for arbitrary $p \ge 1$, and by Giraitis and Surgailis (1999) for $p = 1$. Wu (2003)
also considered bounds for the second moment of weighted sup-norms of the leading
term of $\mathcal{E}_{n,p-1;F}(\cdot)$. For the linear process and the weighted sup-metric $d_{\phi_\lambda}$, the approach
of Dehling and Taqqu (1989) was applied to the case $p = 1$ by Beutner et al. (2012). In
the following theorem we generalize the latter to arbitrary $p \ge 1$.

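Theorem 4.1 Let $p \in \mathbb{N}$, $\lambda \ge 0$, and assume that

(a) $a_s = s^{-\beta}\,\ell(s)$, $s \in \mathbb{N}$, where $\beta \in (\tfrac{1}{2}, 1)$ and $\ell$ is slowly varying at infinity.

(b) $\mathbb{E}[|\varepsilon_1|^{(4+2\lambda)\vee(2p)}] < \infty$.

(c) The df $G$ of $\varepsilon_1$ is $p + 1$ times differentiable and $\sum_{j=1}^{p+1} \int |G^{(j)}(x)|^2 \phi_{2\lambda}(x)\, dx < \infty$.

(d) $p(2\beta - 1) < 1$.

Then

$$\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\, \mathcal{E}_{n,p-1;F}(\cdot) \xrightarrow{d} (-1)^p F^{(p)}(\cdot)\, Z_{p,\beta} \quad (\text{in } (D_{\phi_\lambda}, \mathcal{D}_{\phi_\lambda}, d_{\phi_\lambda})), \qquad (39)$$

where

$$Z_{p,\beta} := c_{p,\beta} \int_{-\infty < u_1 < \cdots < u_p < 1} \Big\{ \int_0^1 \mathbb{1}_{(u_p,1)}(v) \prod_{j=1}^{p} (v - u_j)^{-\beta}\, dv \Big\}\, W(du_1) \cdots W(du_p)$$

with $W$ a white noise measure (i.e. an additive Gaussian random set function satisfying
$\mathbb{E}[W(B)] = 0$ and $\mathbb{E}[W(B)\, W(B')] = |B \cap B'|$ for all $B, B' \in \mathcal{B}(\mathbb{R})$) and

$$c_{p,\beta} := \Big( \frac{p!\,(1 - p(\beta - 1/2))\,(1 - p(2\beta - 1))}{\big(\int_0^\infty (x + x^2)^{-\beta}\, dx\big)^{p}} \Big)^{1/2}.$$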
Remark 4.2 (i) The infinite moving average representation of an ARFIMA($p, d, q$)
process with fractional difference parameter $d \in (0, 1/2)$ satisfies assumption (a) with
$\beta = 1 - d$; see, for instance, Hosking (1981, Section 3).

(ii) Here we have chosen to define the stochastic process (37) in terms of $\hat F_n$, $F$ and
the Appell polynomials of $F$, because this allows us later to define the statistic (43) in
terms of the df of the observables. However, we conjecture that assumption (b) can be
relaxed to $\mathbb{E}[|\varepsilon_1|^{2+2\lambda}] < \infty$ by replacing, as in Wu (2003, 2006), the Appell polynomials
$A_{j;F}(X_i)$ by the expressions $A^{(1)}_{j;F}(X_i)$ (to be introduced at the beginning of the proof
of Theorem 4.1) and by the method of proof as in Beutner et al. (2012), where this was
done for $p = 1$.

(iii) Condition (c) implies in particular that the df $F$ of $X_1$ is $p$ times differentiable
with $F^{(p)} \in D_{\phi_\lambda}$; cf. inequality (30) in Wu (2003) with $n = \infty$, $k = 1$ and $\gamma = 2\lambda$.
Further, assumption (c) can be relaxed in that it suffices to require that there is some
$m \in \mathbb{N}$ such that the df $G_m$ of $\sum_{s=0}^{m} a_s \varepsilon_{m-s}$ is $p + 1$ times differentiable and satisfies
$\sum_{j=1}^{p+1} \int |G_m^{(j)}(x)|^2 \phi_{2\lambda}(x)\, dx < \infty$. The proof still works in this setting; see also Wu
(2003). ◇
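Part (i) of the remark can be made concrete in the simplest case. The sketch below assumes an ARFIMA($0, d, 0$) process, whose moving average coefficients are $a_s = \Gamma(s+d)/(\Gamma(s+1)\Gamma(d))$; the function names are illustrative. It checks numerically that $a_s s^{\beta}$ with $\beta = 1 - d$ stabilizes, i.e. that assumption (a) holds here with $\ell(s)$ tending to the constant $1/\Gamma(d)$.

```python
import math

def arfima_ma_coefficient(s, d):
    """MA(infinity) coefficient a_s of ARFIMA(0, d, 0): Gamma(s + d) / (Gamma(s + 1) * Gamma(d))."""
    # Work on the log scale to avoid overflow of the Gamma function for large s.
    return math.exp(math.lgamma(s + d) - math.lgamma(s + 1) - math.lgamma(d))

def slowly_varying_part(s, d):
    """ell(s) = a_s * s**beta with beta = 1 - d; it should approach 1 / Gamma(d)."""
    beta = 1.0 - d
    return arfima_ma_coefficient(s, d) * s ** beta

d = 0.3                       # illustrative fractional difference parameter in (0, 1/2)
limit = 1.0 / math.gamma(d)   # candidate limit of ell(s)
values = [slowly_varying_part(s, d) for s in (10, 1000, 100000)]
```

The entries of `values` approach `limit` as $s$ grows, which is the decay $a_s \sim s^{-\beta}/\Gamma(d)$ in this special case.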



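Proof (of Theorem 4.1) It was shown by Avram and Taqqu (1987, Theorem 1) that
the $p$th Appell polynomial of $F$ evaluated at $X_i$ has the representation

$$
\begin{aligned}
A_{p;F}(X_i)
&= \sum_{\ell=1}^{p}\ \sum_{q(\ell) \in \Pi_{\ell,p}} \frac{p!}{q_1! \cdots q_\ell!} \sum_{m(\ell) \in \Lambda_{q(\ell)}}\ \prod_{k=1}^{\ell} a_{m_k}^{q_k}\, A_{q_k;G}(\varepsilon_{i-m_k}) \\
&= p! \sum_{m(p) \in \Lambda_{q(p)}}\ \prod_{k=1}^{p} a_{m_k}\, A_{1;G}(\varepsilon_{i-m_k})
 + \sum_{\ell=1}^{p}\ \sum_{\substack{q(\ell) \in \Pi_{\ell,p} \\ q(\ell) \ne (1,\ldots,1)}} \frac{p!}{q_1! \cdots q_\ell!} \sum_{m(\ell) \in \Lambda_{q(\ell)}}\ \prod_{k=1}^{\ell} a_{m_k}^{q_k}\, A_{q_k;G}(\varepsilon_{i-m_k}) \\
&=: A^{(1)}_{p;F}(X_i) + A^{(2)}_{p;F}(X_i),
\end{aligned}
$$

where $A_{q_k;G}$ denotes the $q_k$th Appell polynomial of the df $G$ of $\varepsilon_1$, and, for every $\ell \in
\{1, \ldots, p\}$, $\Pi_{\ell,p}$ is the set of all $q(\ell) = (q_1, \ldots, q_\ell) \in \mathbb{N}^\ell$ satisfying $q_1 + \cdots + q_\ell = p$ and
$1 \le q_1 \le \cdots \le q_\ell$. Moreover, for a given $q(\ell) = (q_1, \ldots, q_\ell)$ we denote by $\Lambda_{q(\ell)}$ the set of
all $m(\ell) = (m_1, \ldots, m_\ell) \in \mathbb{N}_0^\ell$ such that $m_i \ne m_j$ for $i \ne j$ and, in addition, if $q_i = q_{i+1}$,
then $m_i < m_{i+1}$. So, introducing a telescoping sum, we obtain from (38) (with $A^{(1)}_{j;F}$ and
$A^{(2)}_{j;F}$ defined analogously for $j = 1, \ldots, p$, and $A^{(1)}_{0;F} := 1$)

$$
\begin{aligned}
\mathcal{E}_{n,p-1;F}(\cdot)
&= \Big[ \hat F_n(\cdot) - \sum_{j=0}^{p} (-1)^j F^{(j)}(\cdot) \Big(\frac{1}{n}\sum_{i=1}^{n} A^{(1)}_{j;F}(X_i)\Big) \Big]
 - \sum_{j=1}^{p} (-1)^j F^{(j)}(\cdot) \Big(\frac{1}{n}\sum_{i=1}^{n} A^{(2)}_{j;F}(X_i)\Big) \\
&\quad + (-1)^p F^{(p)}(\cdot) \Big(\frac{1}{n}\sum_{i=1}^{n} A_{p;F}(X_i)\Big) \\
&=: \mathcal{S}_{n,p}(\cdot) + \mathcal{T}_{n,p}(\cdot) + \mathcal{U}_{n,p}(\cdot).
\end{aligned}
$$

Under assumptions (a), $\mathbb{E}[|\varepsilon_0|^{2p}] < \infty$, and (d), it follows from Step 3 in the proof of
Theorem 2 of Avram and Taqqu (1987) that the expression

$$\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\, \frac{1}{n}\sum_{i=1}^{n} A^{(2)}_{j;F}(X_i)$$

converges in probability to zero for every $j = 1, \ldots, p$. So, in view of $F^{(j)} \in D_{\phi_\lambda}$,
$j = 1, \ldots, p$, we obtain

$$\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\, d_{\phi_\lambda}(\mathcal{T}_{n,p}(\cdot), 0) \xrightarrow{p} 0.$$

Avram and Taqqu (1987, Theorem 2) also showed that under the same assumptions the
expression

$$\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\, \frac{1}{n}\sum_{i=1}^{n} A_{p;F}(X_i) = \frac{1}{n^{1-p(\beta-1/2)}\,\ell(n)^{p}} \sum_{i=1}^{n} A_{p;F}(X_i)$$

converges in distribution to $Z_{p,\beta}$; for the shape of the normalizing constant $c_{p,\beta}$ see
Ho and Hsing (1996, Lemma 6.1). So, in view of $F^{(p)} \in D_{\phi_\lambda}$, the re-scaled process
$\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\,\mathcal{U}_{n,p}(\cdot)$ converges in distribution to $(-1)^p F^{(p)}(\cdot)\, Z_{p,\beta}$ w.r.t. $d_{\phi_\lambda}$.
In the remainder of the proof we will show that

$$\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\, d_{\phi_\lambda}(\mathcal{S}_{n,p}(\cdot), 0) \xrightarrow{p} 0 \qquad (40)$$

so that assertion (39) will follow from Slutzky's lemma.

For (40) to be true, it suffices to show that $d_{\phi_\lambda}(S_{n,p}/a_{n,p}, 0)$ converges in probability to
zero, where $S_{n,p} := n\, \mathcal{S}_{n,p}$ and $a_{n,p} := n^{1-p(\beta-1/2)}\ell(n)^{p}$. Under the assumptions (a),
$\mathbb{E}[|\varepsilon_0|^{4+2\lambda}] < \infty$, (c) and (d), we have from Theorem 2 and Lemma 5 of Wu (2003)
that

$$\mathbb{E}\big[ d_{\phi_\lambda}(S_{n,p}(\cdot), 0)^2 \big] = O\big( n (\log n)^2 + \Xi_{n,p} \big) \qquad (41)$$

with $\Xi_{n,p} = O(n^{2-(p+1)(2\beta-1)}\ell(n)^{2(p+1)})$ [notice that there is a typo in Lemma 5 of Wu
(2003) where it must be $p(2\beta - 1) < 1$ instead of $(p + 1)(2\beta - 1) < 1$]. From (41) we
obtain by the Markov inequality for some constant $C > 0$ and every $\varepsilon > 0$

$$
\begin{aligned}
\mathbb{P}\big[ d_{\phi_\lambda}(S_{n,p}(\cdot)/a_{n,p}, 0) > \varepsilon \big]
&= \mathbb{P}\big[ a_{n,p}^{-2}\, d_{\phi_\lambda}(S_{n,p}(\cdot), 0)^2 > \varepsilon^2 \big] \\
&\le \varepsilon^{-2}\, \frac{1}{a_{n,p}^{2}}\, \mathbb{E}\big[ d_{\phi_\lambda}(S_{n,p}(\cdot), 0)^2 \big] \\
&\le C \varepsilon^{-2}\, \frac{n (\log n)^2 + n^{2-(p+1)(2\beta-1)}\ell(n)^{2(p+1)}}{n^{2-p(2\beta-1)}\ell(n)^{2p}} \\
&\le C \varepsilon^{-2} \Big( \frac{(\log n)^2}{n^{1-p(2\beta-1)}\ell(n)^{2p}} + \frac{\ell(n)^2}{n^{2\beta-1}} \Big).
\end{aligned}
$$

Due to assumption (d), the latter bound converges to zero as $n \to \infty$. That is,
$d_{\phi_\lambda}(S_{n,p}/a_{n,p}, 0)$ indeed converges in probability to zero. □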
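The role of assumption (d) in the last step can be illustrated numerically. The following sketch evaluates the two summands of the final Markov bound under simplifying assumptions that are not part of the proof: the slowly varying function is taken to be $\ell \equiv 1$ and the constants $C$ and $\varepsilon$ are set to 1. The function name `markov_bound_terms` is hypothetical.

```python
import math

def markov_bound_terms(n, p, beta):
    """Two summands of the final bound, assuming ell == 1 and C == epsilon == 1.

    first  = (log n)^2 / n^(1 - p(2*beta - 1))
    second = 1 / n^(2*beta - 1)
    """
    if not (0.5 < beta < 1.0):
        raise ValueError("beta must lie in (1/2, 1)")
    if p * (2 * beta - 1) >= 1:
        raise ValueError("condition (d), p(2*beta - 1) < 1, is violated")
    first = math.log(n) ** 2 / n ** (1 - p * (2 * beta - 1))
    second = 1.0 / n ** (2 * beta - 1)
    return first, second

# Illustrative parameters: p = 2 and beta = 0.6, so that p(2*beta - 1) = 0.4 < 1.
bounds = [sum(markov_bound_terms(n, p=2, beta=0.6)) for n in (10**4, 10**8, 10**12)]
```

Both summands carry a negative power of $n$ exactly when (d) holds, so the entries of `bounds` decrease towards zero; if (d) fails, the first exponent is non-positive and the function refuses to evaluate.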



Combining Theorems 3.15 and 4.1 one can in principle easily derive the asymptotic
distribution of non-degenerate and degenerate V-statistics based on linear long-memory
sequences. For non-degenerate V-statistics (as, for instance, Gini's mean difference from
Example 3.10) one can apply part (i) of Theorem 3.15; see also Hsing (2000), who uses
a different approach for Gini's mean difference. For degenerate V-statistics (as, for
instance, the Cramér-von Mises statistic from Example 3.13) one can apply part (ii)
of Theorem 3.15. However, in the long-memory case the situation is often more complex
because several V-statistics based on long-memory sequences systematically degenerate
asymptotically. For instance, the (sample) variance with corresponding kernel
$g(x_1, x_2) = \frac{1}{2}(x_1 - x_2)^2$ is typically non-degenerate w.r.t. $(g, F)$ (cf. Example 3.11), but
in this case the integral on the right-hand side in (27) with $B^\circ(\cdot) = (-1) F^{(1)}(\cdot)\, Z_{1,\beta}$
equals

$$-\sum_{\ell=1}^{2} \int B^\circ(x-)\, dg_{\ell,F}(x) = Z_{1,\beta} \sum_{\ell=1}^{2} \int F^{(1)}(x-)\,(x - \mathbb{E}[X_1])\, dx = 0.$$

Indeed, from Example 3.11 we know that in this case $dg_{1,F}(x) = dg_{2,F}(x) = (x -
\mathbb{E}[X_1])\, dx$ holds, and hence $\int F^{(1)}(x-)(x - \mathbb{E}[X_1])\, dx = \int F^{(1)}(x)(x - \mathbb{E}[X_1])\, dx = 0$.
That is, in the long-memory case the sample variance regarded as a V-statistic is
asymptotically degenerate w.r.t. $(g, F, (n^{\beta-1/2}\ell(n)^{-1})_n)$ in the sense of Section 2, and so an
application of part (i) of Theorem 3.15 yields little. Moreover, part (ii) of Theorem 3.15 is
not useful either in this case, because part (ii) of Theorem 3.15 is based on the fact that
the linear part in the representation (6) vanishes. However, this is not the case here, since
the sample variance is not (finite sample) degenerate w.r.t. $(g, F)$. This is in accordance
with the remarkable observation of Dehling and Taqqu (1991) that in the long-memory
case both terms of the von Mises (respectively Hoeffding) decomposition of the sample
variance contribute to the asymptotic distribution. Dehling and Taqqu (1991) considered
the sample variance based on Gaussian long-memory sequences. From the following
Corollary 4.3 we can not only derive the analogue for linear long-memory sequences (see
Example 4.7), but can also derive the asymptotic distribution of more general
asymptotically degenerate U- and V-statistics based on linear long-memory sequences (see, for
instance, Examples 4.6, 4.8 and 4.9). We note that recently Lévy-Leduc et al. (2011a)
also derived the asymptotic distribution of some asymptotically degenerate U-statistics
(with bounded kernels) based on Gaussian long-memory sequences using different
techniques. For further applications of their results see Lévy-Leduc et al. (2011b).



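Corollary 4.3 Let $F$ be a df on the real line, and $g : \mathbb{R}^2 \to \mathbb{R}$ be some measurable
function. Assume that the representation (6) with $F_n := \hat F_n$ holds for $F$ and $g$, and that

$$\sum_{\ell=1}^{2} \int \phi_{-\lambda}(x)\, |dg_{\ell,F}|(x) < \infty \quad\text{and}\quad \iint \phi_{-\lambda}(x_1)\, \phi_{-\lambda}(x_2)\, |dg|(x_1, x_2) < \infty \qquad (42)$$

holds for some $\lambda \ge 0$. Let $p, q, r \in \mathbb{N}$, set

$$
\begin{aligned}
V_{n,g;p,q,r}(\hat F_n) := {} & V_g(\hat F_n) - V_g(F) \qquad (43) \\
& + \sum_{\ell=1}^{2} \sum_{j=1}^{p-1} (-1)^j \Big(\frac{1}{n}\sum_{i=1}^{n} A_{j;F}(X_i)\Big) \int F^{(j)}(x-)\, dg_{\ell,F}(x) \\
& - \sum_{j=1}^{q-1} (-1)^j \Big(\frac{1}{n}\sum_{i=1}^{n} A_{j;F}(X_i)\Big) \times \iint F^{(j)}(x_1-)\,(\hat F_n(x_2-) - F(x_2-))\, dg(x_1, x_2) \\
& - \sum_{k=1}^{r-1} (-1)^k \Big(\frac{1}{n}\sum_{i=1}^{n} A_{k;F}(X_i)\Big) \times \iint (\hat F_n(x_1-) - F(x_1-))\, F^{(k)}(x_2-)\, dg(x_1, x_2) \\
& + \sum_{j=1}^{q-1} \sum_{k=1}^{r-1} (-1)^{j+k} \Big(\frac{1}{n}\sum_{i=1}^{n} A_{j;F}(X_i)\Big) \Big(\frac{1}{n}\sum_{i=1}^{n} A_{k;F}(X_i)\Big) \times \iint F^{(j)}(x_1-)\, F^{(k)}(x_2-)\, dg(x_1, x_2)
\end{aligned}
$$

using, as before, the convention $\sum_{j=1}^{0}(\cdots) := 0$, and assume that all integrals on the
right-hand side in (43) are well defined (which is in particular the case if $F$ is $\max\{p, q, r\}$
times differentiable with $F^{(k)} \in D_{\phi_\lambda}$ for all $k = 0, \ldots, \max\{p, q, r\}$).

(i) Assume $q + r > p$ and that the assumptions (a)-(c) of Theorem 4.1 with $p$ replaced
by $\max\{p, q, r\}$ hold for the same $\lambda$. Then, if in addition $s(2\beta - 1) < 1$ holds for
each $s \in \{p, q, r\}$, we have

$$\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\, V_{n,g;p,q,r}(\hat F_n) \xrightarrow{d} (-1)^p Z_{p,\beta} \sum_{\ell=1}^{2} \int F^{(p)}(x-)\, dg_{\ell,F}(x)$$

with $Z_{p,\beta}$ defined as in Theorem 4.1.

(ii) Assume $q + r = p$ and that the assumptions (a)-(d) of Theorem 4.1 hold for the
same $\lambda$ and $p$. Then we have

$$
\begin{aligned}
\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\, V_{n,g;p,q,r}(\hat F_n) \xrightarrow{d} {} & (-1)^p Z_{p,\beta} \sum_{\ell=1}^{2} \int F^{(p)}(x-)\, dg_{\ell,F}(x) \qquad (44) \\
& + (-1)^{q+r} Z_{q,\beta} Z_{r,\beta} \iint F^{(q)}(x_1-)\, F^{(r)}(x_2-)\, dg(x_1, x_2)
\end{aligned}
$$

with $Z_{s,\beta}$ defined as in Theorem 4.1 for $s \in \{p, q, r\}$.

Recall that assumption (c) of Theorem 4.1 implies that $F$ is $\max\{p, q, r\}$ times
differentiable and that all derivatives up to the $\max\{p, q, r\}$th derivative lie in $D_{\phi_\lambda}$.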
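The index conditions of the two parts are easy to mix up, so a small bookkeeping helper may be useful. The sketch below only encodes the conditions on $(p, q, r, \beta)$ stated in the corollary (not the moment and smoothness assumptions); the function names `corollary_case` and `scaling_exponent` are hypothetical, and $\beta$ is passed as an exact fraction to avoid rounding issues at the boundary $s(2\beta - 1) = 1$.

```python
from fractions import Fraction

def corollary_case(p, q, r, beta):
    """Return "i" or "ii" if the index conditions of that part hold, else None."""
    beta = Fraction(beta)
    if not Fraction(1, 2) < beta < 1:
        raise ValueError("beta must lie in (1/2, 1)")
    if any(s * (2 * beta - 1) >= 1 for s in (p, q, r)):
        return None          # s(2*beta - 1) < 1 fails for some s in {p, q, r}
    if q + r > p:
        return "i"
    if q + r == p:
        return "ii"
    return None              # q + r < p is not covered by the corollary

def scaling_exponent(p, beta):
    """Exponent of n in the normalisation n^{p(beta - 1/2)} ell(n)^{-p}."""
    return p * (Fraction(beta) - Fraction(1, 2))

beta = Fraction(3, 5)                      # illustrative long-memory parameter
gini_case = corollary_case(1, 1, 1, beta)  # p = q = r = 1
variance_case = corollary_case(2, 1, 1, beta)
```

For instance, with $\beta = 3/5$ the choice $(p, q, r) = (4, 2, 2)$ is still admissible, whereas with $\beta = 7/10$ it is not, because $4(2\beta - 1) \ge 1$.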

Remark 4.4 The random variables $Z_{p,\beta}$, $Z_{q,\beta}$ and $Z_{r,\beta}$ in part (ii) of Corollary 4.3 are
dependent. The specification of their joint distribution seems to be an open problem.
Only in the Gaussian case the joint cumulants of $Z_{1,\beta}$ and $Z_{2,\beta}$ are known from the
supplementary material to Lévy-Leduc et al. (2011a). Notice that it is even hard to
specify the (Rosenblatt) distribution of $Z_{2,\beta}$; for details see Veillette and Taqqu (2012).
◇



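Proof (of Corollary 4.3) Using the representation (6) of $V_g(\hat F_n) - V_g(F)$, we obtain

$$
\begin{aligned}
V_{n,g;p,q,r}(\hat F_n)
= {} & -\sum_{\ell=1}^{2} \int [\hat F_n(x-) - F(x-)]\, dg_{\ell,F}(x) + \iint (\hat F_n - F)(x_1-)\, (\hat F_n - F)(x_2-)\, dg(x_1, x_2) \\
& + \sum_{\ell=1}^{2} \sum_{j=1}^{p-1} (-1)^j \Big(\frac{1}{n}\sum_{i=1}^{n} A_{j;F}(X_i)\Big) \int F^{(j)}(x-)\, dg_{\ell,F}(x) \\
& - \sum_{j=1}^{q-1} (-1)^j \Big(\frac{1}{n}\sum_{i=1}^{n} A_{j;F}(X_i)\Big) \times \iint F^{(j)}(x_1-)\,(\hat F_n(x_2-) - F(x_2-))\, dg(x_1, x_2) \\
& - \sum_{k=1}^{r-1} (-1)^k \Big(\frac{1}{n}\sum_{i=1}^{n} A_{k;F}(X_i)\Big) \times \iint (\hat F_n(x_1-) - F(x_1-))\, F^{(k)}(x_2-)\, dg(x_1, x_2) \\
& + \sum_{j=1}^{q-1} \sum_{k=1}^{r-1} (-1)^{j+k} \Big(\frac{1}{n}\sum_{i=1}^{n} A_{j;F}(X_i)\Big) \Big(\frac{1}{n}\sum_{i=1}^{n} A_{k;F}(X_i)\Big) \times \iint F^{(j)}(x_1-)\, F^{(k)}(x_2-)\, dg(x_1, x_2) \\
= {} & -\sum_{\ell=1}^{2} \int \mathcal{E}_{n,p-1;F}(x-)\, dg_{\ell,F}(x) + \iint \mathcal{E}_{n,q-1;F}(x_1-)\, \mathcal{E}_{n,r-1;F}(x_2-)\, dg(x_1, x_2).
\end{aligned}
$$

Moreover, by Theorem 4.1,

$$\{n^{s(\beta-1/2)}\ell(n)^{-s}\}\, \mathcal{E}_{n,s-1;F}(\cdot) \xrightarrow{d} (-1)^s Z_{s,\beta}\, F^{(s)}(\cdot) \quad (\text{in } (D_{\phi_\lambda}, \mathcal{D}_{\phi_\lambda}, d_{\phi_\lambda}))$$

for $s = p, q, r$. Therefore, assertion (i) follows from the Continuous Mapping Theorem
and (42) as well as Slutzky's lemma and the assumption $q + r > p$. Moreover, assertion
(ii) follows from the Continuous Mapping Theorem, (42) and the assumption $p = q + r$.

□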

It is worth pointing out that, as mentioned at the beginning of this section, $V_{n,g;p,q,r}(\hat F_n)$
is obtained using the representation (6) and an "expansion" of $\hat F_n - F$ in the sense of
(37). Obviously, with increasing $p$, $q$ or $r$, the expression $V_{n,g;p,q,r}(\hat F_n)$ defined in (43)
is getting more and more involved. So, for statistical applications one should choose $p$,
$q$ and $r$ as small as possible. On the other hand, $p$ has to be chosen large enough that
the limit in distribution of $\{n^{p(\beta-1/2)}\ell(n)^{-p}\}\, V_{n,g;p,q,r}(\hat F_n)$ does not vanish. That is, an
application of Corollary 4.3 requires a trade-off between the simplicity of the statistic
$V_{n,g;p,q,r}(\hat F_n)$ and the benefit of the asymptotic distribution. A particularly favorable
situation is the one where some (or preferably all) terms on the right-hand side of (43),
which are different from $V_g(\hat F_n) - V_g(F)$, vanish. This is the case if the respective
integrals $\int F^{(j)}(x-)\, dg_{\ell,F}(x)$ etc. vanish, for instance, in the case of the sample variance
and in the case of the test for symmetry; cf. Examples 4.7 and 4.9. In other situations
the statistic $V_{n,g;p,q,r}(\hat F_n)$ might be more complicated than $V_g(\hat F_n) - V_g(F)$. Yet, it seems
to be among the best achievable results. Finally, notice that in cases where part (i) or
part (ii) of Theorem 3.15 already yields a nontrivial asymptotic distribution, the result
can also be derived by Corollary 4.3. This is exemplified in the next remark.

Remark 4.5 (i) For Gini's mean difference take $p = q = r = 1$. Then the result
obtained from part (i) of Corollary 4.3 coincides with the result we get from part (i) of
Theorem 3.15.

(ii) For the weighted Cramér-von Mises statistic take $p = 2$ and $q = r = 1$. Then,
under the hypothesis that $F = F_0$, we have from part (ii) of Corollary 4.3 that the
asymptotic distribution equals

$$Z_{1,\beta} Z_{1,\beta} \iint F^{(1)}(x_1-)\, F^{(1)}(x_2-)\, dg(x_1, x_2) = (Z_{1,\beta})^2 \int w(x)\, (F^{(1)}(x-))^2\, dF(x),$$

where we used Equations (20) and (21). That is in accordance with Example 3.22. ◇

Gini's mean difference discussed in part (i) of the preceding remark is an example of
an asymptotically non-degenerate U- or V-statistic. The weighted Cramér-von Mises
statistic discussed in part (ii) of the preceding remark is an example of an asymptotically
degenerate U- or V-statistic of type 1.a) in the sense of Section 2. The following
two Examples 4.6 and 4.7 provide some asymptotically degenerate U- or V-statistics of
type 1.b) and type 1.c), respectively. Examples 4.8 and 4.9 below will provide some
asymptotically degenerate U- or V-statistics of type 2 in the sense of Section 2.
Example 4.6 (Squared absolute mean of a symmetric distribution) The kernel $g(x_1, x_2)
= x_1 \cdot x_2$ for estimating the squared mean has been investigated repeatedly in the
literature. Let us consider here the related kernel $g(x_1, x_2) = |x_1| \cdot |x_2|$ for estimating the
squared absolute mean of a distribution $F$ having a finite first moment. In this case,
we obtain $g_{i,F}(x) = \mathbb{E}[|X_1|] \cdot |x|$ and hence $dg_{i,F}(x) = \mathbb{E}[|X_1|]\,(-\mathbb{1}_{\{x<0\}} + \mathbb{1}_{\{x>0\}})\, dx$ for
$i = 1, 2$. Moreover, we have $dg(x_1, x_2) = (\mathbb{1}_{\{x_1>0,\, x_2>0\}} - \mathbb{1}_{\{x_1<0,\, x_2>0\}} - \mathbb{1}_{\{x_1>0,\, x_2<0\}} +
\mathbb{1}_{\{x_1<0,\, x_2<0\}})\, dx_1 dx_2$. It is then easily checked that the conditions of Lemmas 3.4 and 3.6
are fulfilled for any weight function $\phi$ with $\int 1/\phi(x)\, dx < \infty$. Now, let us in addition
assume that $F^{(1)}$ is symmetric about 0. Then, on one hand, Theorem 3.15 (i) with $B^\circ(\cdot) =
(-1) F^{(1)}(\cdot)\, Z_{1,\beta}$ yields that the limiting distribution of $\{n^{\beta-1/2}\ell(n)^{-1}\}(V_g(\hat F_n) - V_g(F))$
is given by

$$-\sum_{\ell=1}^{2} \int B^\circ(x-)\, dg_{\ell,F}(x) = 2 Z_{1,\beta}\, \mathbb{E}[|X_1|] \Big( -\int_{-\infty}^{0} F^{(1)}(x-)\, dx + \int_{0}^{\infty} F^{(1)}(x-)\, dx \Big) = 0.$$

On the other hand, part (ii) of Theorem 3.15 is not helpful either, because it only yields
that the limiting distribution of $\{n^{2(\beta-1/2)}\ell(n)^{-2}\}(V_g(\hat F_n) - V_g(F))$ is given by

$$\iint B^\circ(x_1-)\, B^\circ(x_2-)\, dg(x_1, x_2) = Z_{1,\beta}^2 \iint F^{(1)}(x_1)\, F^{(1)}(x_2)\, dg(x_1, x_2) = 0,$$

where for the latter "=" we used the symmetry of $F^{(1)}$. However, if we take $p = 2$ and
$q = r = 1$, we have $V_{n,g;2,1,1}(\hat F_n) = V_g(\hat F_n) - V_g(F)$ and obtain by Corollary 4.3 (ii)

$$
\begin{aligned}
\{n^{2(\beta-1/2)}\ell(n)^{-2}\}(V_g(\hat F_n) - V_g(F))
&\xrightarrow{d} Z_{2,\beta} \sum_{\ell=1}^{2} \int F^{(2)}(x-)\, dg_{\ell,F}(x) + Z_{1,\beta}^2 \iint F^{(1)}(x_1-)\, F^{(1)}(x_2-)\, dg(x_1, x_2) \\
&= 2 Z_{2,\beta}\, \mathbb{E}[|X_1|] \Big( -\int_{-\infty}^{0} F^{(2)}(x)\, dx + \int_{0}^{\infty} F^{(2)}(x)\, dx \Big) \\
&= 4 Z_{2,\beta}\, \mathbb{E}[|X_1|] \int_{0}^{\infty} F^{(2)}(x)\, dx,
\end{aligned}
$$

where for the latter "=" we used the antisymmetry of $F^{(2)}$ (i.e. $F^{(2)}(x) = -F^{(2)}(-x)$)
which holds by the symmetry of $F^{(1)}$. This shows that in the present case $V_g(\hat F_n)$ is an
asymptotically degenerate V-statistic w.r.t. $(g, F, (n^{\beta-1/2}\ell(n)^{-1}))$ of type 1.b) in the
sense of Section 2. ◇



Example 4.7 (Variance) As discussed above, in our long-memory setting we can apply
neither part (i) nor part (ii) of Theorem 3.15 to derive a nontrivial asymptotic
distribution for the sample variance; recall that the sample variance is a V-statistic with
corresponding kernel $g(x_1, x_2) = \frac{1}{2}(x_1 - x_2)^2$. However, part (ii) of Corollary 4.3 enables
us to derive a nontrivial asymptotic distribution. From Example 3.11 we know that
$dg_{1,F}(x) = dg_{2,F}(x) = (x - \mathbb{E}[X_1])\, dx$, which implies $\int F^{(1)}(x-)\, dg_{\ell,F}(x) = 0$. So we
have $V_{n,g;2,1,1}(\hat F_n) = V_g(\hat F_n) - V_g(F)$ and obtain by Corollary 4.3 (ii)

$$
\begin{aligned}
\{n^{2(\beta-1/2)}\ell(n)^{-2}\}(V_g(\hat F_n) - V_g(F))
&\xrightarrow{d} Z_{2,\beta} \sum_{\ell=1}^{2} \int F^{(2)}(x-)\, dg_{\ell,F}(x) + Z_{1,\beta}^2 \iint F^{(1)}(x_1-)\, F^{(1)}(x_2-)\, dg(x_1, x_2) \\
&= 2 Z_{2,\beta} \int F^{(2)}(x-)\,(x - \mathbb{E}[X_1])\, dx - \Big( Z_{1,\beta} \int F^{(1)}(x)\, dx \Big)^2 \\
&= 2 Z_{2,\beta} \int F^{(2)}(x-)\,(x - \mathbb{E}[X_1])\, dx - (Z_{1,\beta})^2, \qquad (45)
\end{aligned}
$$

where for the first "=" we used the fact that $dg(x_1, x_2)$ is the negative of the Lebesgue
measure on $\mathbb{R}^2$; cf. Example 3.11. Notice that $\int F^{(1)}(x-)\, dg_{\ell,F}(x) = 0$ holds for every
(sufficiently smooth) df $F$, so that (45) is fully satisfactory even in a non-parametric
setting. Notice also that in the present case $V_g(\hat F_n)$ is an asymptotically degenerate
V-statistic w.r.t. $(g, F, (n^{\beta-1/2}\ell(n)^{-1}))$ of type 1.c) in the sense of Section 2. ◇
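That the sample variance is the V-statistic with kernel $g(x_1, x_2) = \frac{1}{2}(x_1 - x_2)^2$, and that it splits into a linear and a quadratic part, can be checked directly on data. The sketch below is purely illustrative: the helper names, the toy sample and the centering value `mu` are assumptions made for the example, and the variance is the version normalized by $n$ (not $n - 1$).

```python
def v_statistic(sample, kernel):
    """V-statistic of degree 2: average of kernel(x_i, x_j) over all n^2 ordered pairs."""
    n = len(sample)
    return sum(kernel(x, y) for x in sample for y in sample) / n ** 2

def variance_kernel(x1, x2):
    return 0.5 * (x1 - x2) ** 2

def biased_sample_variance(sample):
    """Sample variance with normalisation 1/n."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n

def linear_and_quadratic_part(sample, mu):
    """Split the variance as (1/n) sum (x_i - mu)^2 minus (mean - mu)^2 for a centering mu."""
    n = len(sample)
    linear = sum((x - mu) ** 2 for x in sample) / n
    quadratic = (sum(sample) / n - mu) ** 2
    return linear, quadratic

data = [1.0, 4.0, 2.5, -3.0, 0.5]   # toy sample
mu = 0.7                            # stands in for E[X_1]
v_stat = v_statistic(data, variance_kernel)
linear, quadratic = linear_and_quadratic_part(data, mu)
```

Here `v_stat` agrees with `biased_sample_variance(data)` and with `linear - quadratic`; in the long-memory setting of the example, both of these parts are of the same order after re-scaling.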

Example 4.8 (Scaling sequence $(a_n^3)$) Let us consider the kernel $g(x_1, x_2) = x_1(|x_2| - 1)$,
and suppose that $F^{(1)}$ is symmetric about zero and that $m := \mathbb{E}[|X_1|] = 1$. This
setting is somewhat artificial but it leads to quite an interesting limiting behavior of
the corresponding U- or V-statistic. We have $g_{1,F}(x_1) = x_1(m - 1) = 0$ and $g_{2,F}(x_2) =
\mathbb{E}[X_1](|x_2| - 1) = 0$ due to the assumption $m = 1$ and the symmetry of $F^{(1)}$, respectively.
That is, $V_g(\hat F_n)$ is a degenerate V-statistic w.r.t. $(g, F)$, and consequently an application
of part (i) of Theorem 3.15 does not lead to a non-degenerate limiting distribution.
Moreover, it is easily seen that $dg(x_1, x_2) = (\mathbb{1}_{\{x_2>0\}} - \mathbb{1}_{\{x_2<0\}})\, dx_1 dx_2$. Therefore,
part (ii) of Theorem 3.15 does not provide a tool to derive a non-degenerate limiting
distribution, because under the imposed assumption the right-hand side in (28) with
$B^\circ(\cdot) = (-1) F^{(1)}(\cdot)\, Z_{1,\beta}$ vanishes.

On the other hand, part (ii) of Corollary 4.3 enables us to derive a nontrivial
asymptotic distribution. In contrast to the sample variance in Example 4.7, however, it does
not make sense to work with $V_{n,g;2,1,1}(\hat F_n)$, because in this case the limit in (44)
vanishes. Indeed, the first summand of the limit vanishes since $g_{1,F} = g_{2,F} = 0$, and the
second summand of the limit vanishes since under the imposed assumptions the integral
$\iint F^{(1)}(x_1-)\, F^{(1)}(x_2-)\, dg(x_1, x_2)$ equals zero. As a consequence we need to work with
a $p$ larger than 2. For instance, for $(p, q, r) = (3, 1, 2)$ we obtain from Corollary 4.3 (ii)
and $g_{1,F} = g_{2,F} = 0$ that

$$
\begin{aligned}
\{n^{3(\beta-1/2)}\ell(n)^{-3}\}\, V_{n,g;3,1,2}(\hat F_n)
&\xrightarrow{d} -Z_{3,\beta} \sum_{\ell=1}^{2} \int F^{(3)}(x-)\, dg_{\ell,F}(x) - Z_{1,\beta} Z_{2,\beta} \iint F^{(1)}(x_1-)\, F^{(2)}(x_2-)\, dg(x_1, x_2) \\
&= -Z_{1,\beta} Z_{2,\beta} \Big( \int_{-\infty}^{\infty}\!\int_{0}^{\infty} F^{(1)}(x_1)\, F^{(2)}(x_2)\, dx_2\, dx_1 - \int_{-\infty}^{\infty}\!\int_{-\infty}^{0} F^{(1)}(x_1)\, F^{(2)}(x_2)\, dx_2\, dx_1 \Big) \\
&= -2 Z_{1,\beta} Z_{2,\beta} \int_{0}^{\infty} F^{(2)}(x_2)\, dx_2,
\end{aligned}
$$

which is typically distinct from zero. Notice that above we may replace $V_{n,g;3,1,2}(\hat F_n)$ by
$V_g(\hat F_n) - V_g(F)$ since

$$
\begin{aligned}
V_{n,g;3,1,2}(\hat F_n)
= {} & V_g(\hat F_n) - V_g(F) + \sum_{\ell=1}^{2} \sum_{j=1}^{2} (-1)^j \Big(\frac{1}{n}\sum_{i=1}^{n} A_{j;F}(X_i)\Big) \int F^{(j)}(x-)\, dg_{\ell,F}(x) \\
& + \Big(\frac{1}{n}\sum_{i=1}^{n} A_{1;F}(X_i)\Big) \iint (\hat F_n(x_1-) - F(x_1-))\, F^{(1)}(x_2-)\, dg(x_1, x_2) \\
= {} & V_g(\hat F_n) - V_g(F) \\
& + \Big(\frac{1}{n}\sum_{i=1}^{n} A_{1;F}(X_i)\Big) \Big( \int_{-\infty}^{\infty}\!\int_{0}^{\infty} (\hat F_n(x_1-) - F(x_1-))\, F^{(1)}(x_2)\, dx_2\, dx_1 \\
& \qquad\qquad - \int_{-\infty}^{\infty}\!\int_{-\infty}^{0} (\hat F_n(x_1-) - F(x_1-))\, F^{(1)}(x_2)\, dx_2\, dx_1 \Big) \\
= {} & V_g(\hat F_n) - V_g(F),
\end{aligned}
$$

where we used $g_{1,F} = g_{2,F} = 0$, the continuity of $F^{(1)}$, and the symmetry of $F^{(1)}$ about
zero. Thus, in the present case $V_g(\hat F_n)$ is an asymptotically degenerate V-statistic w.r.t.
$(g, F, (n^{\beta-1/2}\ell(n)^{-1}))$ of type 2 in the sense of Section 2. ◇
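To see that the integral $\int_0^\infty F^{(2)}(x)\, dx$ in the limit is indeed nonzero in a concrete case, one can evaluate it numerically. The sketch below assumes, purely for illustration, a centred normal df scaled so that $\mathbb{E}[|X_1|] = 1$ (standard deviation $\sqrt{\pi/2}$); the trapezoidal rule, the truncation point and all names are ad hoc choices and not part of the example.

```python
import math

SIGMA = math.sqrt(math.pi / 2)   # assumed: centred normal scaled so that E|X| = 1

def density(x):
    """F^{(1)} for the assumed normal df."""
    return math.exp(-x * x / (2 * SIGMA ** 2)) / (SIGMA * math.sqrt(2 * math.pi))

def density_derivative(x):
    """F^{(2)} for the assumed normal df (antisymmetric about zero)."""
    return -x / SIGMA ** 2 * density(x)

def trapezoid(func, a, b, steps=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b)) + sum(func(a + k * h) for k in range(1, steps))
    return total * h

upper = 12 * SIGMA   # truncation point; the neglected tail mass is negligible here
mean_abs = 2 * trapezoid(lambda x: x * density(x), 0.0, upper)   # E|X_1|
integral_pos = trapezoid(density_derivative, 0.0, upper)         # int_0^inf F^{(2)}(x) dx
integral_neg = trapezoid(density_derivative, -upper, 0.0)        # int_{-inf}^0 F^{(2)}(x) dx
```

For this df, `integral_pos` is close to $-1/\pi$ and `integral_neg` to $+1/\pi$, so the factor $-2\int_0^\infty F^{(2)}(x_2)\, dx_2$ multiplying $Z_{1,\beta} Z_{2,\beta}$ in the display above would be $2/\pi$.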

The next example shows that it might not even be sufficient to take the scaling
sequence $(n^{3(\beta-1/2)}\ell(n)^{-3})$ to obtain a non-degenerate limiting distribution.

Example 4.9 (Test for symmetry, scaling sequence $(a_n^4)$) Let us come back to the test
statistic $T_n$ defined in (22), which is a V-statistic with kernel given by (25). We
restrict to the null hypothesis that the distribution is symmetric about zero. We have
seen in Example 3.14 that in this case we obtain $g_{1,F} = g_{2,F} = 0$ and $dg(x_1, x_2) =
\mathcal{H}^1_{\mathrm{D}}(d(x_1, x_2)) - \mathcal{H}^1_{\overline{\mathrm{D}}}(d(x_1, x_2))$. That is, under the null hypothesis, $T_n$ can be seen as a
degenerate V-statistic. So, in principle, we could apply Theorem 3.15 (ii) to derive the
asymptotic distribution of $T_n = V_g(\hat F_n)$. However, the integral on the right-hand side in
(28) with $B^\circ(\cdot) = (-1) F^{(1)}(\cdot)\, Z_{1,\beta}$ equals

$$\iint B^\circ(x_1)\, B^\circ(x_2)\, dg(x_1, x_2) = Z_{1,\beta}^2 \Big( \int F^{(1)}(x)\, F^{(1)}(x)\, dx - \int F^{(1)}(x)\, F^{(1)}(-x)\, dx \Big) = 0, \qquad (46)$$
because $F^{(1)}$ is symmetric about zero. Now one might tend to apply Corollary 4.3 as
in Example 4.8, i.e. with $(p, q, r) = (3, 1, 2)$, to obtain a nontrivial limiting distribution.
However, the integrals on the right-hand side of (44) equal zero in that case. Indeed,
the first one equals zero, because $g_{1,F} = g_{2,F} = 0$. The second one, which is given by

$$\iint F^{(1)}(x_1-)\, F^{(2)}(x_2-)\, dg(x_1, x_2) = \int F^{(1)}(x)\, F^{(2)}(x)\, dx - \int F^{(1)}(x)\, F^{(2)}(-x)\, dx,$$

equals zero, because of the symmetry of $F^{(1)}$ and the antisymmetry of $F^{(2)}$. However,
applying part (ii) of Corollary 4.3 with $(p, q, r) = (4, 2, 2)$ we obtain (using again that
$g_{1,F} = g_{2,F} = 0$)

$$
\begin{aligned}
\{n^{4(\beta-1/2)}\ell(n)^{-4}\}\, V_{n,g;4,2,2}(\hat F_n)
&\xrightarrow{d} Z_{2,\beta}^2 \Big( \int F^{(2)}(x)\, F^{(2)}(x)\, dx - \int F^{(2)}(x)\, F^{(2)}(-x)\, dx \Big) \\
&= 4 Z_{2,\beta}^2 \int_{0}^{\infty} (F^{(2)}(x))^2\, dx
\end{aligned}
$$

by the antisymmetry of $F^{(2)}$. Using the symmetry of $F^{(1)}$ and once again that $g_{1,F} =
g_{2,F} = 0$ it can be easily checked that $V_{n,g;4,2,2}(\hat F_n) = V_g(\hat F_n) - V_g(F)$. ◇